
Overview
Hiring for a client ...
As a Senior Data Scientist, you will be responsible for collecting, organizing, analyzing, and interpreting Truecaller data with a focus on NLP. In this role, you will be pivotal in advancing our work with large language models and on-device models across diverse regions. Your expertise will enhance our natural language processing, machine learning, and predictive analytics capabilities.
What you bring in:
● 5+ years of experience in designing, developing, and deploying ML models at scale, with a focus on NLP-driven solutions.
● Strong background in Natural Language Processing (NLP), including text classification, entity recognition, language modeling, and transformer-based architectures.
● Experience in building and deploying models at scale, handling millions of messages efficiently while maintaining performance and accuracy, as well as working with on-device models.
● Ability to not only build ML models but also take ownership of deploying them into production, ensuring scalability, reliability, and monitoring.
● Knowledge of anomaly detection, adversarial ML techniques, and risk modeling to identify and prevent spam and fraudulent messaging activities.
● Strong ability to take ML models from research and experimentation to production, working closely with ML engineers and data engineers.
● Expertise in machine learning libraries such as TensorFlow, PyTorch, pandas, and scikit-learn, along with NLP-specific tools like Hugging Face Transformers and spaCy, plus experience with TFLite and ONNX.
● Hands-on experience fine-tuning LLMs including transformer-based architectures (BERT, GPT, LLaMA, T5, etc.) for domain-specific applications, including knowledge distillation, quantization, and model compression for efficiency.
● Strong ability to design, refine, and optimize prompts for LLM-based applications, ensuring high-quality responses and reduced model hallucinations.
● Ability to drive data-driven decisions through experimentation and statistical analysis to improve models and business outcomes.
● Strong understanding of designing, testing, and optimizing prompts for LLM-based applications to improve model accuracy and efficiency.
● Programming knowledge in at least one language, such as Python or R; Python is preferred.
● Expert knowledge of machine learning algorithms.
● Familiarity with database modelling and data warehousing principles, with a working knowledge of SQL.
● Experience in building and optimizing large-scale data processing systems using Spark/PySpark.
● Strong ability to work cross-functionally with engineers, product managers, and business stakeholders to align ML solutions with company objectives.
The impact you will create:
● Take a loosely defined business problem and break it into tractable data problems. For each data problem, clearly articulate the value of solving it, its impact, and its complexity.
● Collaborate with Product and Engineering to scope, design, and implement systems that solve complex business problems, ensuring they are delivered on time and within scope.
● Design, develop, and optimize state-of-the-art NLP models for large-scale message classification, fraud detection, and spam filtering, impacting millions of users globally.
● Take full ownership of ML model development, deployment, and monitoring, ensuring models are production-ready, scalable, and cost-efficient.
● Lead data science projects from ideation to deployment, ensuring alignment with business objectives and timelines.
● Manage and analyze large datasets collected from multiple countries, ensuring data integrity and consistency.
● Stay updated on industry best practices and emerging technologies to drive innovation within the Data Team.
● Work collaboratively across systems and teams to solve user and business problems, helping to define success and to design and build the systems that achieve it.
● Work with Product to decide on priorities, set direction, design solutions, and help the team implement them.
It would be great if you also have:
● Understanding of Conversational AI
● Experience deploying NLP models in production
● Working knowledge of GCP components
Job Type: Full-time
Pay: ₹4,000,000.00 - ₹6,000,000.00 per year
Benefits:
- Cell phone reimbursement
- Food provided
- Health insurance
- Paid time off
- Provident Fund
- Work from home
Work Location: In person