Overview
About Us
We're building the next generation of voice AI — where LLMs don't just read and write, they listen and speak. We need an engineer who deeply understands both LLM architectures and audio systems to build seamless audio-to-audio experiences.
Tech Stack
Core: Python, PyTorch, HuggingFace Transformers
Speech/Audio: Whisper, Wav2Vec2, Coqui TTS, ESPnet, librosa, torchaudio
LLM Infra: vLLM, TensorRT-LLM, ONNX, Triton
Audio Codecs: EnCodec, SoundStream, DAC
Vocoders: HiFi-GAN, Vocos, BigVGAN
Infra: Docker, Kubernetes, AWS/GCP, Redis, Kafka
What You'll Do
LLM Optimization & Integration
Optimize LLM inference for real-time voice applications (latency, throughput, memory)
Integrate audio encoders/decoders with transformer-based language models
Implement streaming inference pipelines for conversational AI
Fine-tune and adapt LLMs for speech-aware tasks
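The streaming-inference point above can be sketched in miniature: the win comes from letting downstream TTS start speaking on sentence-sized chunks instead of waiting for the full LLM response. This is a minimal illustration with a hypothetical token stream standing in for a real autoregressive decoder; none of these function names come from the stack listed above.

```python
from typing import Iterator

def llm_tokens(prompt: str) -> Iterator[str]:
    # Stand-in for an autoregressive LLM emitting one token at a time.
    yield from ["Sure", ",", " I", " can", " help", ".", " What", " next", "?"]

def sentence_chunks(tokens: Iterator[str]) -> Iterator[str]:
    """Regroup streamed tokens into sentence-sized chunks so TTS can begin
    synthesizing before the full LLM response has finished decoding."""
    buf = []
    for tok in tokens:
        buf.append(tok)
        if tok.strip() in {".", "!", "?"}:  # crude sentence boundary
            yield "".join(buf)
            buf = []
    if buf:  # flush any trailing partial sentence
        yield "".join(buf)

chunks = list(sentence_chunks(llm_tokens("hi")))
```

In a real pipeline each yielded chunk would be handed to the TTS stage immediately, overlapping synthesis with decoding and cutting perceived first-audio latency.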
Audio-to-Audio Systems
Build end-to-end speech-to-speech pipelines (ASR → LLM → TTS)
Develop real-time voice transformation and conversion models
Implement neural audio codecs for speech tokenization
Design low-latency (<300ms) duplex conversation systems
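The ASR → LLM → TTS chain and the sub-300 ms budget above can be expressed as a simple pipeline skeleton. The three stage functions here are placeholder stubs (not real models), and the 300 ms default mirrors the budget stated in this posting; everything else is an illustrative assumption.

```python
import time

# Hypothetical stage stubs standing in for real ASR / LLM / TTS models.
def asr(audio_chunk: bytes) -> str:
    return "hello"

def llm(text: str) -> str:
    return f"echo: {text}"

def tts(text: str) -> bytes:
    return text.encode()

def speech_to_speech(audio_chunk: bytes, budget_ms: float = 300.0):
    """Run ASR -> LLM -> TTS on one chunk and report whether the
    end-to-end latency budget held."""
    start = time.perf_counter()
    text_in = asr(audio_chunk)
    text_out = llm(text_in)
    audio_out = tts(text_out)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return audio_out, elapsed_ms <= budget_ms

# 20 ms of silence at 16 kHz, 16-bit mono (640 bytes)
out, within_budget = speech_to_speech(b"\x00" * 640)
```

In practice each stage would run streamed and overlapped rather than strictly sequential, since a serial 300 ms budget leaves little room once network and audio I/O are included.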
Voice Synthesis & Processing
Build/optimize TTS systems for natural, expressive speech
Implement neural vocoders for high-quality audio generation
Design phoneme-level models and grapheme-to-phoneme (G2P) conversion systems
Develop voice cloning and speaker adaptation capabilities
What We're Looking For
Must Have
5+ years in ML/AI engineering with a focus on speech or audio
Deep understanding of transformer/LLM architectures and how to optimize them
Hands-on experience with speech models (Whisper, Wav2Vec2, or similar)
Experience building TTS or ASR systems in production
Strong Python + PyTorch skills
Understanding of audio fundamentals (spectrograms, mel filterbanks, sampling)
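The audio fundamentals listed above (spectrograms, mel filterbanks, sampling) boil down to a short computation. This is a pure-NumPy sketch rather than the librosa/torchaudio calls a candidate would use in practice; the window and hop sizes are arbitrary example values.

```python
import numpy as np

def spectrogram(signal: np.ndarray, n_fft: int = 512, hop: int = 128) -> np.ndarray:
    """Magnitude spectrogram via a short-time FFT with a Hann window.
    Returns shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def hz_to_mel(f: float) -> float:
    # HTK-style mel scale: perceptually motivated warping of frequency.
    return 2595.0 * np.log10(1.0 + f / 700.0)

sr = 16000                              # sampling rate (Hz)
t = np.arange(sr) / sr                  # 1 second of audio
sig = np.sin(2 * np.pi * 440.0 * t)     # pure 440 Hz tone
S = spectrogram(sig)                    # energy concentrates near bin 440 * n_fft / sr
```

A mel spectrogram is then just this linear-frequency spectrogram projected through a bank of triangular filters spaced evenly on the mel scale, which is what `librosa.feature.melspectrogram` and `torchaudio.transforms.MelSpectrogram` compute end to end.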
Good to Have
Experience with neural audio codecs (EnCodec, SoundStream, DAC)
Familiarity with LLM serving (vLLM, TensorRT-LLM)
Background in real-time audio streaming (WebRTC)
Published work or open-source contributions in speech AI
C++/CUDA for performance optimization