Bangalore, Karnataka, India
Information Technology
Full-Time
Bajaj Finserv
Overview
Location Name: Pune Corporate Office - Mantri
Job Purpose
We are seeking a dynamic AI/ML Engineer to join our pioneering voice Gen AI R&D team. The ideal candidate will possess a strong foundation in machine learning and a passion for innovation. This role involves developing advanced voice AI solutions.
Duties And Responsibilities
Research and Innovation: Stay abreast of the latest advancements in Gen AI/ML technologies, contributing to research initiatives and applying innovative solutions to practical problems.
Generative AI & Model Optimization
- Fine-tune LLMs/SLMs with proprietary NBFC data.
- Perform distillation and quantization of LLMs for edge deployment.
- Evaluate and run LLM/SLM models on local/edge server machines.
- Develop and fine-tune bots capable of negotiation using contextual understanding, emotion detection, and dynamic loan pitch logic.
- Build intelligent Dialogue Management frameworks that adapt in real-time.
- Evaluate Speech-to-Speech (S2S) models for natural voice responses.
- Assess STT models for accuracy on Indic dialects; explore emotion-aware TTS engines.
- Experiment with speaker diarization for multi-speaker environments.
- Collect and analyze voice samples for biometric model training.
- Evaluate biometric algorithms for fraud prevention and authentication.
- Implement anti-spoofing techniques to prevent deepfakes/recorded attacks.
- Ensure data privacy compliance in voice data usage.
- Build self-learning systems that adapt without full retraining (e.g., learn new rejection patterns from calls).
- Implement lightweight local models to enable real-time learning on the edge.
Model Selection & Customization
- Choosing the right STT, TTS, and S2S models for various Indic languages and dialects.
- Deciding between open-source vs. commercial APIs based on latency, cost, and control.
LLM/SLM Strategy
- Selecting appropriate LLM/SLM architectures for dialogue management and negotiation logic.
- Deciding what to fine-tune, distill, or quantize, and what to leave generic.
Edge vs. Cloud Architecture
- Making trade-offs between on-device processing and cloud-based orchestration.
- Defining what runs locally for speed/privacy and what needs backend support.
Emotion & Dialogue Logic Integration
- Mapping emotional cues to appropriate TTS responses and negotiation tone.
- Designing fallback logic for unrecognized or hostile user responses.
Voice Biometrics Algorithm Evaluation
- Choosing and testing biometric algorithms for authentication and anti-spoofing.
- Deciding thresholds for matching, rejection, and fraud escalation.
Building a bot that doesn't just answer but negotiates with human-like reasoning.
Running large models (LLM/STT/TTS) in low-latency, low-bandwidth environments without cloud dependency.
Understanding caller emotions in noisy, multilingual conditions (anger, hesitation, sarcasm).
Ensuring STT and TTS pipelines work well with dialect-rich, low-resource Indian languages.
Preventing fraud carried out through recorded calls or deepfake voices.
Enabling the bot to learn from failed interactions.
Required Qualifications And Experience
- Educational Background: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Experience: 2–8 years of experience in AI/ML, with exposure to Natural Language Processing (NLP) and speech technologies.
- Strong experience in Speech AI – STT, TTS, S2S, speaker diarization, or related areas.
- Proficiency with LLMs/SLMs and the Hugging Face, LangChain, or OpenAI stack.
- Experience with model optimization techniques (quantization, distillation).
- Knowledge of edge AI deployment and low-latency serving.
- Understanding of emotion modeling, biometric systems, and anti-spoofing.
- Experience in Python, PyTorch/TensorFlow, and scalable deployment workflows.
- Bonus: Experience in Indian language dialects, voice data collection, or field deployments in semi-urban/rural settings.
- LLM fine-tuning, Speech AI – STT, TTS, S2S, speaker diarization.