Overview
About the Role
We are building India’s first agentic browser, an Electron + Chromium desktop application that combines on-device large language models, LangChain-style orchestration, and privacy-preserving sandboxing to automate everyday web tasks. You will be the foundational AI engineer on a six-person product squad, owning the prototyping, fine-tuning, and production deployment of language-model agents that power features such as Smart-Tab Insights, IRCTC form-filling, and Retrieval-Augmented Web Q&A.
This position is ideal for early-career engineers (1-3 years) who have shipped ML projects in Python and want end-to-end ownership, from data pipelines to inference optimisation, in a fast-moving startup.
Key Responsibilities
Model Prototyping & Fine-Tuning
Select open-source LLM checkpoints (e.g., Llama-3 8B) and fine-tune with LoRA on domain-specific datasets for intent detection, summarisation, and form-filling tasks.
Implement RAG pipelines with vector stores (Pinecone/FAISS) for website Q&A and Deep-Web crawler outputs.
Agent Orchestration
Design LangChain/CrewAI chains for multi-step workflows (e.g., “Book train → pay via UPI → save ticket to DigiLocker”).
Develop memory and context-persistence modules so agents recall user preferences across sessions.
Inference & MLOps
Package quantised models (INT4/GPTQ) inside the Electron runtime using web-friendly back-ends such as vLLM or TGI.
Automate continuous evaluation, prompt regression tests, and GPU/CPU performance dashboards (MLflow, Prometheus).
Data & Compliance
Build secure data pipelines that respect India’s DPDP Act, including on-device red-teaming and PII scrubbing.
Collaborate with the Infra engineer to containerise sandboxed inference services (Docker/Kubernetes).
Cross-Functional Collaboration
Pair with full-stack engineers to expose agent capabilities via a drag-and-drop command bar and visual AI assistant.
Write clear RFCs and technical docs; demo prototypes in bi-weekly sprint reviews.
Requirements
Minimum Qualifications
1-3 years of professional experience in machine-learning engineering or applied NLP.
Strong Python skills with DL frameworks (PyTorch / TensorFlow).
Hands-on exposure to at least one LLM toolkit (LangChain, LlamaIndex, Hugging Face Transformers).
Familiarity with vector databases and RAG patterns.
Ability to write clean, production-grade code covered by unit tests and CI pipelines.
Preferred / “Good-to-Have” Skills
Experience running quantised models on-device (GGUF, WebGPU) or on GPU clouds (AWS/Azure A100).
Knowledge of browser automation (Playwright / Puppeteer) and DOM parsing.
Understanding of privacy engineering: sandboxing, data-tokenisation, or differential privacy.
Basic React or TypeScript familiarity to debug front-end agent calls.
Participation in open-source GenAI projects or published notebooks.
If you’re passionate about pushing the boundaries of browser-based AI and want to see your code in the hands of millions, apply now.