Sahibzada Ajit Singh Nagar, Punjab, India
Information Technology
Full-Time
PalTech
Overview
Job Description
Key Responsibilities
Leadership & Team Guidance
- Mentor and lead data scientists and ML engineers across multiple AI/ML and GenAI projects.
- Provide direction on solution design, model development, data pipelines, and deployment strategies.
- Set and enforce best practices in experimentation, reproducibility, code quality, and delivery excellence.
AI/ML & Data Science Expertise
- Design and implement end-to-end ML workflows across classical and modern data science:
  - Regression, classification, clustering, time-series forecasting
  - Anomaly detection, recommender systems
  - Feature engineering, statistics, hypothesis testing, causal modelling
- Conduct research and build models across NLP, Computer Vision, and Reinforcement Learning.
- Use deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers for model development.
Generative AI & Agentic AI Systems
- Build, fine-tune, and deploy Large Language Models (LLMs) such as GPT, Llama, Mistral, Qwen, and similar models.
- Implement and optimize Retrieval-Augmented Generation (RAG) pipelines.
- Build multi-agent workflows using frameworks such as LangChain, LangGraph, CrewAI, or LlamaIndex.
- Rapidly prototype with Hugging Face Transformers and similar model libraries.
Production-Grade AI Delivery
- Architect and deploy large-scale, production-grade ML and GenAI systems with monitoring and reliability.
- Build and deploy ML APIs using FastAPI or Flask.
- Implement CI/CD pipelines, model monitoring, versioning, and ML/AgentOps (MLflow, LangFuse, evaluation frameworks).
- Optimize models for performance, latency, and cost efficiency across cloud and on-prem environments.
On-Prem GPU Infrastructure & Local LLM Hosting
- Manage and maintain on-premise GPU servers, including:
  - CUDA & cuDNN installation and troubleshooting
  - GPU driver setup & optimization
  - NVIDIA toolkit, Docker GPU runtime configuration
- Set up, maintain, and administer JupyterHub/JupyterLab for team-wide model development.
- Host and run LLMs locally using:
  - Ollama, llama.cpp, vLLM, text-generation-inference, or similar runtimes
- Optimize GPU memory usage and inference performance, including through model quantization (GGUF, int4/int8).
- Manage local deployment environments, containers, and dependencies for on-prem workflows.
Client Collaboration & Pre-Sales Engagement
- Engage in pre-sales discussions to understand client requirements and pain points.
- Recommend scalable, cost-efficient AI/ML/GenAI architectures (cloud or on-prem).
- Prepare and deliver technical presentations, solution proposals, and architecture diagrams.
- Translate complex AI concepts into clear narratives for business and technical stakeholders.
Innovation, Governance & Best Practices
- Stay up to date with advancements in AI, GenAI, ML engineering, and GPU acceleration.
- Build internal tools, playbooks, and frameworks to improve delivery and engineering practices.
- Drive compliance with enterprise standards around security, data governance, and reliability.
Required Qualifications
- 6+ years of hands-on experience in data science, machine learning, and production deployment.
- Strong team leadership, mentoring, and technical direction experience.
- Expertise in Python and ML/DL frameworks: PyTorch, TensorFlow, Hugging Face, Scikit-learn.
- Proven experience working with LLMs, transformers, RAG, and GenAI applications.
- Strong background in core data science (statistics, modelling, feature engineering, ML algorithms).
- Hands-on experience deploying production AI systems on cloud or on-prem environments.
- Experience maintaining on-prem GPU servers, CUDA installation, and driver management.
- Familiarity with JupyterHub/JupyterLab setup, user management, and environment provisioning.
- Experience hosting LLMs locally using Ollama, llama.cpp, vLLM, or similar.
- Strong communication skills, especially in pre-sales, technical architecture discussions, and client engagements.
Preferred Qualifications
- Experience with agentic AI frameworks such as LangChain, LangGraph, CrewAI, or LlamaIndex.
- Familiarity with MLflow, LangFuse, Prometheus, or other monitoring frameworks.
- Understanding of MLOps/DevOps practices, Docker/Kubernetes, and CI/CD pipelines.
- Contributions to open-source AI/ML or GenAI tooling.
- Certifications in cloud, AI/ML, or data engineering.
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in