Bangalore, Karnataka, India
Information Technology
Full-Time
Global Payments Inc.
Overview
Description
RESPONSIBILITIES
- Design and implement CI/CD pipelines for AI and ML model training, evaluation, and RAG system deployment (including LLMs, vector databases, embedding and reranking models, governance and observability systems, and guardrails).
- Provision and manage AI infrastructure across cloud hyperscalers (AWS/GCP) using infrastructure-as-code tools (strong preference for Terraform).
- Maintain containerized environments (Docker, Kubernetes) optimized for GPU workloads and distributed compute.
- Support vector database, feature store, and embedding store deployments (e.g., pgVector, Pinecone, Redis, Featureform, MongoDB Atlas).
- Monitor and optimize performance, availability, and cost of AI workloads, using observability tools (e.g., Prometheus, Grafana, Datadog, or managed cloud offerings).
- Collaborate with data scientists, AI/ML engineers, and other members of the platform team to ensure smooth transitions from experimentation to production.
- Implement security best practices including secrets management, model access control, data encryption, and audit logging for AI pipelines.
- Help support the deployment and orchestration of agentic AI systems (LangChain, LangGraph, CrewAI, Copilot Studio, AgentSpace, etc.).
Must Haves:
- 4+ years of DevOps or infrastructure engineering experience, preferably with 2+ years in AI/ML environments.
- Hands-on experience with cloud-native services (AWS Bedrock/SageMaker, GCP Vertex AI, or Azure ML) and GPU infrastructure management.
- Strong skills in CI/CD tools (GitHub Actions, ArgoCD, Jenkins) and configuration management (Ansible, Helm, etc.).
- Proficiency in scripting languages such as Python and Bash (Go or similar is a plus).
- Experience with monitoring, logging, and alerting systems for AI/ML workloads.
- Deep understanding of Kubernetes and container lifecycle management.
- Exposure to MLOps tooling such as MLflow, Kubeflow, SageMaker Pipelines, or Vertex Pipelines.
- Familiarity with prompt engineering, model fine-tuning, and inference serving.
- Experience with secure AI deployment and compliance frameworks.
- Knowledge of model versioning, drift detection, and scalable rollback strategies.
- Ability to work with a high level of initiative, accuracy, and attention to detail.
- Ability to prioritize multiple assignments effectively and meet established deadlines.
- Ability to successfully, efficiently, and professionally interact with staff and customers.
- Excellent organization skills.
- Critical thinking ability for problems ranging from moderately to highly complex.
- Flexibility in meeting the business needs of the customer and the company.
- Ability to work creatively and independently with latitude and minimal supervision.
- Ability to utilize experience and judgment in accomplishing assigned goals.
- Experience in navigating organizational structure.
Talk to us
Feel free to call, email, or hit us up on our social media accounts.
Email
info@antaltechjobs.in