AI Infrastructure and Experience Engineer
Our Client - Information Technology & Services company
- Mountain View, CA
Job description
Our Customer is a Silicon Valley-based company that is engaged in researching emerging technologies.
We are seeking a contract AI Infrastructure and Experience Engineer to help support our Customer's business needs. This role is on-site in Mountain View, CA.
Responsibilities:
- Deploy and tune multiple LLMs and generative multimodal models on local inference hardware.
- Optimize inference performance metrics such as time-to-first-token (TTFT) and tokens per second through model quantization, caching strategies, and architecture-specific optimizations.
- Leverage CUDA to build custom kernels for maximum GPU utilization.
- Integrate inference backends with orchestration layers such as LiteLLM and Ollama, and frontends such as Open WebUI.
- Rapidly prototype high-fidelity demos showcasing model memory, agentic workflows, and context-aware web search.
- Implement communication protocols to connect local AI compute with peripheral devices such as smart TVs, household appliances, and XR hardware.
Skills and Qualifications:
- Bachelor’s degree in Computer Science, Machine Learning, Artificial Intelligence, or related field preferred.
- Minimum 3 years of relevant industry experience.
- Recent experience in model optimization.
- Proven experience with NVIDIA ecosystems and ARM64 architecture.
- Advanced proficiency in Python, C++, and Rust.
- Deep expertise in CUDA, including custom CUDA kernel development and debugging.
- Experience with modern inference engines such as llama.cpp, TensorRT-LLM, and Ollama.
- Experience with orchestration frameworks such as LiteLLM.
- Strong understanding of asynchronous programming using FastAPI.
- Experience with containerization technologies such as Docker and Kubernetes.
- Experience designing APIs for low-latency communication.
- Experience building frontend UIs using React, Next.js, or similar frameworks.
- Familiarity with WebSockets, gRPC, and REST for device-to-device communication.
- Strong problem-solving skills.
- Ability to work in fast-paced, rapidly changing environments.
- Strong architectural thinking with the ability to design AI-driven consumer experiences.
Preferred Qualifications:
- Builder mindset with the ability to rapidly create proof-of-concept solutions.
- Strong adaptability and agility in experimental environments.
- Experience with sandbox environments.
We offer a competitive salary range for this position. Most candidates who join our team are hired at the median of this range, ensuring fair and equitable compensation based on experience and qualifications.
Contractor benefits are available through our third-party Employer of Record (available upon completion of a waiting period for eligible engagements).
Benefits include: Medical, Dental, Vision, 401k.
An Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or protected veteran status and will not be discriminated against on the basis of disability.
All applicants applying for U.S. job openings must be legally authorized to work in the United States and are required to have U.S. residency at the time of application.
If you are a person with a disability needing assistance with the application, or at any point in the hiring process, please contact us at support@themomproject.com.