Overview
Role: Data Scientist
Location: Bangalore
Experience: 4 to 6 years
Choosing Capgemini means choosing a place where you’ll be empowered to shape your career, supported by a collaborative global community, and inspired to reimagine what’s possible. Join us in helping leading Consumer Products and Retail Services (CPRS) organizations unlock the value of technology and drive scalable, sustainable growth.
Role Overview
We are seeking an experienced Data Scientist (GenAI) to design, build, and productionize LLM-powered applications (chat agents, copilots, recommendation engines, anomaly/forecasting assistants) for industrial and operational environments. You will lead end-to-end GenAI solutions—from data collection and integration (including AVEVA PI System and control systems) to model training/fine-tuning, RAG pipelines, vector databases, and API deployment with high availability, observability, and CI/CD. The role blends data science, ML engineering, software development, and systems reliability in regulated, enterprise settings.
Key Responsibilities
- GenAI & LLM Solutions
  - Design and implement LLM applications using RAG with LangChain or LlamaIndex; manage prompts, context windows, and retrieval strategies.
  - Build pipelines for embedding generation, vector indexing, and semantic search (e.g., Pinecone, FAISS, Azure Cognitive Search).
  - Perform model training/fine-tuning (instruction tuning, LoRA/QLoRA, prompt optimization) using PyTorch/TensorFlow.
- ML/Data Science
  - Implement anomaly detection, fault tolerance insights, forecasting (time-series), recommender systems, and data-driven decision-making.
  - Execute end-to-end data mining, analysis, and visualization; define data quality and conformity control metrics.
  - Build dynamic models that adapt to evolving process signals and operational contexts.
- Data Engineering & Integration
  - Ingest and harmonize data from AVEVA PI System, control systems, REST APIs, databases, and cloud sources.
  - Design scalable data pipelines (batch/streaming), feature stores, and metadata/lineage tracking.
  - Ensure regulatory compliance, data governance, and privacy/security by design.
- Cloud & Platform
  - Build on Microsoft Azure and/or Google Cloud Platform: storage, compute, managed ML, secrets, monitoring.
  - Select and operate vector databases and LLM inference endpoints (managed or self-hosted) with cost/performance guardrails.
Flexible work options and a culture that values collaboration and continuous learning. Opportunities to work on diverse projects and technologies. Access to training and certifications in software development and emerging technologies. Inclusive and supportive teams focused on innovation and excellence.
About Us
Capgemini is a global leader in consulting, technology services, and digital transformation. As a business and technology transformation partner, it helps organizations accelerate their dual transformation to address the evolving needs of customers and citizens. With a strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations.