700000 - 2000000 INR - Yearly
Noida, Uttar Pradesh, India
Information Technology
Full-Time
Lalitech India Pvt Ltd
Overview
Location: Hybrid/ Remote
Type: Contract / Full‑Time
Experience: 5+ Years
Qualification: Bachelor’s or Master’s in Computer Science or a related technical field
Responsibilities
- Architect & implement the RAG pipeline: embeddings ingestion, vector search (MongoDB Atlas or similar), and context-aware chat generation.
- Design and build Python‑based services (FastAPI) for generating and updating embeddings.
- Host and apply LoRA/QLoRA adapters for per‑user fine‑tuning.
- Automate data pipelines to ingest daily user logs, chunk text, and upsert embeddings into the vector store.
- Develop Node.js/Express APIs that orchestrate embedding, retrieval, and LLM inference for real‑time chat.
- Manage vector index lifecycle and similarity metrics (cosine/dot‑product).
- Deploy and optimize on AWS (Lambda, EC2, SageMaker), containerization (Docker), and monitoring for latency, costs, and error rates.
- Collaborate with frontend engineers to define API contracts and demo endpoints.
- Document architecture diagrams, API specifications, and runbooks for future team onboarding.
Required Skills
- Strong Python expertise (FastAPI, async programming).
- Proficiency with Node.js and Express for API development.
- Experience with vector databases (MongoDB Atlas Vector Search, Pinecone, Weaviate) and similarity search.
- Familiarity with OpenAI’s APIs (embeddings, chat completions).
- Hands‑on with parameters‑efficient fine‑tuning (LoRA, QLoRA, PEFT/Hugging Face).
- Knowledge of LLM hosting best practices on AWS (EC2, Lambda, SageMaker).
Containerization skills (Docker).
- Good understanding of RAG architectures, prompt design, and memory management.
- Strong Git workflow and collaborative development practices (GitHub, CI/CD).
Nice‑to‑Have
- Experience with Llama family models or other open‑source LLMs.
- Familiarity with MongoDB Atlas free tier and cluster management.
- Background in data engineering for streaming or batch processing.
- Knowledge of monitoring & observability tools (Prometheus, Grafana, CloudWatch).
- Frontend skills in React to prototype demo UIs.
Job Type: Full-time
Pay: ₹700,000.00 - ₹2,000,000.00 per year
Work Location: Remote
Similar Jobs
View All
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in