Bangalore, Karnataka, India
Information Technology
Full-Time
VAYUZ Technologies
Overview
Responsibilities
- Own day-2 production operations of a large-scale, AI-first platform on GCP.
- Run, scale, and harden GKE-based workloads integrated with a broad set of GCP managed services (data, messaging, AI, networking, and security).
- Define, implement, and operate SLIs, SLOs, and error budgets across platform and AI services.
- Build and own New Relic observability end-to-end (APM, infrastructure, logs, alerts, dashboards).
- Improve and maintain CI/CD pipelines and Terraform-driven infrastructure automation.
- Operate and integrate Azure AI Foundry for LLM deployments and model lifecycle management.
- Lead incident response, postmortems, and drive systemic reliability improvements.
- Optimize cost, performance, and autoscaling for AI and data-intensive workloads.
Requirements
- 6+ years of hands-on experience in DevOps, SRE, or Platform Engineering roles.
- Strong, production-grade experience with GCP, especially GKE and core managed services.
- Proven expertise running Kubernetes at scale in live environments.
- Deep hands-on experience with New Relic in complex, distributed systems.
- Experience operating AI/ML or LLM-driven platforms in production environments.
- Solid background in Terraform, CI/CD, cloud networking, and security fundamentals.
- Comfortable owning production systems end-to-end with minimal supervision.
Location requirement: the first 2 months are in Mumbai, then Bangalore. Accommodation in Mumbai will be provided for those 2 months.
(ref:hirist.tech)