Bangalore, Karnataka, India
Information Technology
Full-Time
Infosys
Overview
We're looking for a hands-on Python engineer with deep experience building GenAI applications at production scale. You'll design and implement LLM-powered systems (prompt pipelines, RAG, agents, evaluation frameworks) and ship reliable, secure, and cost-efficient AI features used by customers and stakeholders.
Key Responsibilities
- Design & Build GenAI Services: Implement prompt orchestration, tool-using agents, and RAG pipelines (indexing, chunking, embeddings, retrieval, ranking, grounding, guardrails).
- Model Integration: Integrate OpenAI/Azure OpenAI/Claude/Llama (hosted or self-hosted) via SDKs; evaluate models for latency, accuracy, and cost; fine-tune or instruction-tune where appropriate.
- Data & Features: Build ETL/data pipelines to prepare structured and unstructured data; manage vector databases; implement prompt templates and prompt versioning.
- Evaluation & Observability: Define offline metrics (precision@k, hallucination rate, factuality) and online metrics (success rate, latency, CX/CSAT); set up tracing and telemetry (e.g., LangSmith, Weights & Biases, OpenTelemetry).
- Productionization: Write robust, tested APIs/microservices; containerize them; set up CI/CD; ensure security, compliance, and privacy (PII handling, content filters, RBAC).
- Collaboration: Work closely with Product, Data, Security, and MLOps to move from POC → MVP → production.
- Cost/Latency Optimization: Apply token optimization, model selection, caching, batching, prompt compression, and speculative execution.
Must-Have Qualifications
- 5–10 years of professional software development with Python (FastAPI/Flask, asyncio, typing, packaging).
- 2+ years hands-on with LLMs/GenAI: prompt engineering, RAG, embeddings, function calling/tools, guardrails.
- Experience with one or more LLM frameworks: LangChain, LlamaIndex, Haystack, or custom orchestration.
- Vector DBs: Pinecone, FAISS, Chroma, Weaviate, pgvector, or Milvus.
- Model Providers: OpenAI/Azure OpenAI, Anthropic Claude, Google Gemini, Meta Llama, Mistral, or Cohere.
- Cloud & DevOps: Deploy on Azure/AWS/GCP, Docker, CI/CD (GitHub Actions/GitLab), logging/monitoring.
- Strong grounding in data structures/algorithms, HTTP APIs, authentication/authorization (authN/authZ), and unit/integration testing.
- Practical knowledge of prompt testing & evaluation (A/B tests, golden sets, LLM-as-judge with safeguards).
Good-to-Have / Preferred
- Fine-tuning (LoRA/QLoRA, PEFT) or domain adaptation for LLMs.
- Multimodal (image/doc QA, speech) or agents (planning, tool-use, code exec).
- Information retrieval (BM25, hybrid retrieval, re-ranking with BGE, E5, ColBERT).
- MLOps: Feature stores, model registries, Ray/Databricks; Airflow/Prefect for orchestration.
- Security/Compliance: SOC 2, ISO 27001, GDPR, content moderation, PII redaction.
- Domain experience: enterprise search, knowledge assistants, customer support automation, document intelligence, coding copilots, marketing content generation.