Data Scientist | Antal Tech Jobs
6 Days ago

Data Scientist

Information Technology
Full-Time
Akaike Technologies

Overview

Data Scientist – Multimodal & Foundation Models (2-4 years)

Role Overview

This role sits at the critical intersection of Applied Research and Machine Learning Engineering. We are looking for a Data Scientist with 2+ years of experience who possesses a first-principles understanding of attention-based models or a proven track record of working with complex SoTA (State-of-the-Art) architectures.

You won't just be integrating APIs; you will be responsible for understanding, training, modifying, and optimizing Transformer-based architectures, Diffusion models, and emerging paradigms. You will build sophisticated multimodal pipelines (Image, Video, Audio) and ensure these models, ranging from standard LLMs to Large Multimodal Models (LMMs), are fine-tuned, evaluated, and deployed into production.

Core Responsibilities

  • Model Research & Modification: Analyze and improve Transformer architectures. Work deep inside training pipelines for SoTA models, implementing custom loss functions and experimenting with advanced architectural variants (e.g., Mixture of Experts (MoE), State Space Models (SSM)).
  • Multimodal Pipeline Development: Apply LLMs and Foundation models to script understanding and scene breakdown. Construct complex prompts for generative outputs across image, video, and audio modalities.
  • Fine-Tuning & Optimization: Execute domain-specific fine-tuning (LoRA, QLoRA, PEFT) and implement efficiency techniques like mixed precision, quantization, and pruning to make SoTA models production-viable.
  • Evaluation & Benchmarking: Design structured testing frameworks to benchmark model quality, creative intent, and failure modes. Document findings in technical logs and research notes.
  • Production Engineering: Transition research from Jupyter notebooks to production-ready code. Develop and expose model capabilities via REST APIs and collaborate with engineering to integrate solutions into media pipelines.

Eligibility Requirements (Mandatory)

  • Advanced Model Experience: Applicants must have hands-on experience training, modifying, or scaling complex SoTA models (e.g., Llama 3, SDXL, Sora-like architectures, or Whisper). Candidates whose experience is limited to using hosted APIs (OpenAI/Anthropic) or prompt engineering without working at the architecture/training level will not be considered.
  • Experience: 2+ years of hands-on experience in Data Science/ML.
  • Architecture Depth: Deep theoretical and implementation-level understanding of Transformers (Encoder-Decoder/Decoder-only), attention mechanisms, and scaling behavior.
  • Training Expertise: Proven ability to fine-tune models from checkpoints or train them from scratch. Experience managing training stability and convergence for high-parameter models.
  • Research Literacy: Ability to read, summarize, and implement techniques directly from recent ML research papers (e.g., Diffusion, GenAI, FlashAttention, MoE).

Technical Proficiency

Category: Tools & Concepts

  • Frameworks: Python, PyTorch (preferred), HuggingFace (Transformers, Diffusers, PEFT)
  • Model Expertise: LLMs, LMMs (Large Multimodal Models), Diffusion, MoE, TTS/Voice Cloning
  • Techniques: LoRA/QLoRA, Instruction Tuning, Custom Training Loops, Mixed Precision
  • Engineering: GPU Performance Debugging (e.g., CUDA OOM troubleshooting), REST APIs, Inference Optimization

Good to Have (Bonus)

  • Experience with multimodal generation (Vision/Audio Transformers, Image/Video generation).
  • Familiarity with efficient attention implementations (e.g., FlashAttention-2) or orchestration libraries like LangChain.
  • Contributions to open-source machine learning projects or independent research.
  • An interest in creative AI and entertainment technology.

The Ideal Candidate

  • Thinks like a researcher, builds like an engineer: You stay updated on the latest arXiv papers and are comfortable experimenting to find the fix for training instability.
  • Deep Ownership: You prefer deep understanding over black-box usage and are comfortable diagnosing model failures at the tensor level.
  • Iterative Mindset: You enjoy the cycle of Research → Prototyping → Production Integration.