
Overview
About the Company
Quantanite is a customer experience (CX) solutions company that helps fast-growing companies and leading global brands transform and grow. We do this through a collaborative and consultative approach, rethinking business processes and ensuring our clients employ the optimal mix of automation and human intelligence. We are an ambitious team of professionals spread across four continents, and we aim to disrupt our industry by delivering seamless customer experiences for our clients, backed up by exceptional results. We have big dreams and are constantly looking for new colleagues who share our values, passion, and appreciation for diversity.
About the Role:
We are seeking a highly skilled Senior AI Engineer with deep expertise in Agentic frameworks, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, MLOps/LLMOps, and end-to-end GenAI application development. In this role, you will design, develop, fine-tune, deploy, and optimize state-of-the-art AI solutions across diverse enterprise use cases including AI Copilots, Summarization, Enterprise Search, and Intelligent Tool Orchestration.
Key Responsibilities:
- Develop and Fine-Tune LLMs: Adapt models such as GPT-4, Claude, LLaMA, Mistral, and Gemini using instruction tuning, prompt engineering, and chain-of-thought prompting.
- Build RAG Pipelines: Implement Retrieval-Augmented Generation solutions leveraging embeddings, chunking strategies, and vector databases like FAISS, Pinecone, Weaviate, and Qdrant.
- Implement and Orchestrate Agents: Use frameworks and protocols such as MCP, OpenAI Agents SDK, LangChain, LlamaIndex, Haystack, and DSPy to build dynamic multi-agent systems and serverless GenAI applications.
- Deploy Models at Scale: Manage model deployment using Hugging Face, Azure Web Apps, vLLM, and Ollama, including handling local models with GGUF, LoRA/QLoRA, PEFT, and quantization methods.
- Integrate APIs: Seamlessly integrate with APIs from OpenAI, Anthropic, Cohere, Azure, and other GenAI providers.
- Ensure Security and Compliance: Implement guardrails, perform PII redaction, ensure secure deployments, and monitor model performance using advanced observability tools.
- Optimize and Monitor: Lead LLMOps practices focusing on performance monitoring, cost optimization, and model evaluation.
- Work with AWS Services: Hands-on usage of AWS Bedrock, SageMaker, S3, Lambda, API Gateway, IAM, CloudWatch, and serverless computing to deploy and manage scalable AI solutions.
- Contribute to Use Cases: Develop AI-driven solutions like AI copilots, enterprise search engines, summarizers, and intelligent function-calling systems.
- Cross-functional Collaboration: Work closely with product, data, and DevOps teams to deliver scalable and secure AI products.
Required Skills and Experience:
- Deep knowledge of LLMs and foundation models (GPT-4, Claude, Mistral, LLaMA, Gemini).
- Strong expertise in Prompt Engineering, Chain-of-Thought reasoning, and Fine-Tuning methods.
- Proven experience building RAG pipelines and working with modern vector stores (FAISS, Pinecone, Weaviate, Qdrant).
- Hands-on proficiency in LangChain, LlamaIndex, Haystack, and DSPy frameworks.
- Model deployment skills using Hugging Face, vLLM, and Ollama, including handling LoRA/QLoRA, PEFT, and GGUF models.
- Practical experience with AWS serverless services: Lambda, S3, API Gateway, IAM, CloudWatch.
- Strong coding ability in Python or similar programming languages.
- Experience with MLOps/LLMOps for monitoring, evaluation, and cost management.
- Familiarity with security standards: guardrails, PII protection, secure API interactions.
- Use Case Delivery Experience: Proven record of delivering AI Copilots, Summarization engines, or Enterprise GenAI applications.
Experience:
- 4-6 years of experience in AI/ML roles, focusing on LLM agent development, data science workflows, and system deployment.
- Demonstrated experience in designing domain-specific AI systems and integrating structured/unstructured data into AI models.
- Proficiency in designing scalable solutions using LangChain and vector databases.
Job Type: Full-time
Pay: ₹2,000,000.00 - ₹2,500,000.00 per year
Benefits:
- Work from home
Schedule:
- Monday to Friday
Application Question(s):
- What is your current location?
- What is your total work experience?
- What is your current salary?
- What is your expected salary?
- What is your notice period?
- Are you willing to relocate to Mumbai?
Work Location: In person