1200000 - 2500000 INR - Yearly
Space Exploration & Research, Information Technology
Full-Time
Krim
Overview
Work on real-time inference, validation, and production voice AI latency.
Krim is building KrimOS — the autonomous agent operating system for banking.
We are hiring an AI engineer to make AI systems trustworthy in real-time voice interactions.
At the core of KrimOS is KendraOS — a pre-execution validation layer that checks AI actions before they execute. Your role is to make this inference and validation stack fast enough for production voice AI.
What you’ll work on:
- Inference runtime optimisation
- Validation execution engine design
- Real-time voice AI latency reduction
- Parallel validator execution
- Model routing and serving
- Production monitoring for latency and validation performance
Relevant stack includes:
- vLLM
- TensorRT-LLM
- SGLang
- AWQ / GPTQ / FP8
- Continuous batching
- Speculative decoding
- Real-time inference systems
What we’re looking for:
- 1–3 years of relevant experience
- Voice AI or real-time inference experience preferred
- Strong Python and systems programming mindset
- Experience shipping ML systems to production
- Familiarity with open-source model serving frameworks
- Strong interest in inference optimisation and latency engineering
- Use of AI coding tools as part of your daily workflow
Why join:
- Work on a difficult, differentiated problem
- Direct founder access
- High ownership from a very early stage
- Meaningful equity
- Real production deployments in banking environments
To apply, email nath@krim.ai with subject:
“Inference Engineer — [Your Name]”
Please include:
- Brief note or short video on what you’ve built
- Your AI coding workflow
- GitHub or public inference-related work