Hyderabad, Telangana, India
Information Technology
Full-Time
Spydra
Overview
Job Description
We are looking for an exceptional Data Scientist with deep expertise in speech technologies, advanced NLP, and LLM fine-tuning to join our cutting-edge AI research team. In this pivotal role, you will be responsible for building and optimizing state-of-the-art machine learning pipelines that drive intelligent audio and language-based products.
Your work will directly contribute to the development of next-generation AI solutions that are privacy-focused, high-performance, and built for scale.
Key Responsibilities
- Develop and deploy real-time ASR pipelines, leveraging models like Whisper, wav2vec2, or custom speech models.
- Design and implement robust intent detection and entity extraction systems, utilizing transcribed speech, keyword spotting, and semantic pattern recognition.
- Fine-tune LLMs and transformer architectures (BERT, RoBERTa, etc.) for tasks including intent classification, entity recognition, and contextual comprehension.
- Optimize end-to-end pipelines for mobile and on-device inference, employing tools like TFLite, ONNX, quantization, and pruning to achieve low-latency performance.
- Collaborate closely with AI product teams and MLOps engineers to ensure seamless deployment, continuous iteration, and performance monitoring.
Required Skills
- Hands-on experience with ASR models (Whisper, wav2vec2, DeepSpeech, Kaldi, Silero), with a focus on fine-tuning for Indian languages and multilingual scenarios.
- Strong command of NLP techniques such as keyword spotting, sequence labeling, masked token prediction, and rule-based classification.
- Proven track record in LLM and transformer fine-tuning for NER, intent detection, and domain-specific adaptation.
- Expertise in speech metadata extraction, feature engineering, and signal enrichment.
- Proficiency in model optimization methods like quantization-aware training (QAT), pruning, and efficient runtime deployment for edge devices.
- Excellent Python skills with proficiency in PyTorch or TensorFlow, along with solid experience in NumPy, pandas, and real-time data processing frameworks.
Qualifications
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, Data Science, or a related technical field.
- Academic or industry background in speech processing, ASR, telecom analytics, or applied NLP is highly desirable.
- Portfolio showcasing real-world speech/NLP projects, open-source contributions, or published research will be a strong advantage.
Experience
- 3 to 6 or more years of applied experience in speech AI, NLP for intent detection, or machine learning model development.
- Proven success in building, deploying, and optimizing ML models for real-time, low-latency environments.
- Contributions to leading open-source projects like openai/whisper, mozilla/DeepSpeech, or facebook/wav2vec2 are highly valued.
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in