Seeking an experienced AI/ML engineer for a one-time on-site setup of a high-performance local LLM system. This project involves configuring and optimizing a large open-weight language model (LLaMA 4 – 70B) for use in a secure, offline private research environment.

Responsibilities will include:
• Installing and configuring LLaMA 4 (Maverick version) locally on a high-performance Ubuntu system with an RTX 6000 Ada GPU
• Setting up token streaming or a prompt-response architecture using vLLM, Ollama, or a similar inference stack
• Building a lightweight FastAPI (or CLI) interface for model interaction
• Implementing logging of inputs/outputs to disk in JSON or plain text
• Assisting with setup of a local embedding model (e.g., MiniLM or BGE) for vector search/memory recall

Requirements:
• Prior experience running large models locally (13B–70B)
• Familiarity with GPU inference and memory optimization (without quantization)
• Strong Linux skills (Ubuntu CLI)
• Security-first mindset; must respect that the system is fully airgapped
• Ability to communicate clearly and implement from spec

Nice to have (not required):
• Familiarity with LangChain, LangGraph, or agent orchestration frameworks
• Knowledge of inference schedulers, token streaming, or routing logic

Project Details:
• Estimated time: 1–1.5 working days total
• Compensation: Rate negotiable; please include your typical hourly or day rate when applying
• Location: Must be available to work on-site in South Bend, IN
• Security: NDA will be required
Keyword: Software Development
Price: $45.0