Senior DevOps Engineer - ML Engineering Support | Antal Tech Jobs
3 Weeks ago

Senior DevOps Engineer - ML Engineering Support

Bangalore, Karnataka, India
Information Technology
Full-Time
Roku

Overview

Teamwork makes the stream work. Roku is changing how the world watches TV.

Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.

From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

About the Role

We are seeking a talented and experienced Senior Software Engineer, DevOps/SRE to join our dynamic team and play a critical role in supporting Machine Learning Engineering activities. The ideal candidate will have a strong background in DevOps practices, cloud infrastructure management, automation, and MLOps tooling, along with team leadership skills.

If you have a proven track record architecting and scaling ML/AI platforms, enjoy solving intriguing system challenges at internet-scale, are innovative at heart, and thrive in building infrastructure that accelerates ML experimentation and deployment — this role might be a great fit for you!

What You’ll Be Doing
  • Provide technical leadership and guidance to DevOps/SRE engineers supporting ML Engineering initiatives; mentor team members in best practices, technologies, and methodologies.
  • Design, implement, and maintain scalable and resilient cloud infrastructure (AWS & GCP) optimized for ML workloads, including GPU/TPU orchestration and distributed training.
  • Partner with ML Engineers to streamline the end-to-end ML lifecycle: data ingestion, feature engineering, training, evaluation, deployment, and monitoring.
  • Build and maintain CI/CD pipelines for ML applications and models using GitHub Actions, GitLab CI/CD, Argo, or Tekton.
  • Integrate with MLOps platforms (e.g., MLflow, Kubeflow, Airflow, SageMaker, Vertex AI) to ensure reproducibility and traceability of experiments.
  • Lead incident response efforts for ML-serving and training infrastructure, minimizing downtime and ensuring high availability.
  • Implement observability practices for ML workloads, including model performance monitoring, drift detection, and metrics via Prometheus, Grafana, and Datadog.
  • Collaborate with security and compliance teams to ensure adherence to data governance, PCI, SOX, and AI/ML data security standards.
  • Optimize system resources for large-scale ML jobs, including auto-scaling GPU clusters, cost optimization, and quota management.
  • Drive continuous improvement across DevOps + MLOps processes; proactively identify areas for enhancement.
  • Maintain clear documentation and foster a culture of knowledge sharing across DevOps, ML, and Data Engineering teams.
  • Participate in 24x7 on-call rotation, with availability to work with global teams in the event of critical outages.
We’re Excited if You Have
  • 8+ years of experience in DevOps/SRE roles, including at least 2–3 years supporting ML or data-intensive workloads.
  • Strong programming skills in Python or Go; experience building internal tools and automation for ML pipelines.
  • Hands-on experience with Kubernetes, Docker, ECS/EKS/GKE, and service mesh tools such as Istio or Envoy.
  • Familiarity with GPU/accelerator orchestration (NVIDIA GPU Operator, Kubeflow, Slurm, Ray, or similar).
  • Experience with Infrastructure as Code (IaC): Terraform, Helm, Ansible, or CloudFormation.
  • Deep understanding of distributed systems, microservices architecture, and cloud-native design patterns.
  • Exposure to MLOps tools: MLflow, Kubeflow Pipelines, Airflow, Argo, Vertex AI, or SageMaker.
  • Strong proficiency in cloud platforms (AWS and GCP required; Azure a plus).
  • Knowledge of data engineering concepts (object storage like S3/GCS, parquet/ORC, data versioning with DVC or Delta Lake).
  • Experience with networking, security, and compliance (role-based access, VPC design, encryption, auditing).
  • Demonstrated success in cross-functional collaboration with ML, Data, and Product teams.
  • Preferred certifications: Certified Kubernetes Administrator (CKA), AWS Certified DevOps Engineer, Google Professional Cloud DevOps Engineer, NVIDIA Deep Learning Institute courses.
  • AI literacy and curiosity: you have either tried Gen AI in your previous work or outside of work, or you are curious about Gen AI and have explored it.
  • BS degree in Computer Science or equivalent experience.
Benefits

Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension). Our employees can take time off work for vacation and other personal reasons to balance their evolving work and life needs. It's important to note that not every benefit is available in all locations or for every role. For details specific to your location, please consult with your recruiter.

The Roku Culture

Roku is a great place for people who want to work in a fast-paced environment where everyone is focused on the company's success rather than their own. We try to surround ourselves with people who are great at their jobs, who are easy to work with, and who keep their egos in check. We appreciate a sense of humor. We believe a smaller number of very talented people can do more, at lower cost, than a larger number of less talented teams. We're independent thinkers with big ideas who act boldly, move fast, and accomplish extraordinary things through collaboration and trust. In short, at Roku you'll be part of a company that's changing how the world watches TV.

We have a unique culture that we are proud of. We think of ourselves primarily as problem-solvers, which itself is a two-part idea. We come up with the solution, but the solution isn't real until it is built and delivered to the customer. That penchant for action gives us a pragmatic approach to innovation, one that has served us well since 2002. 

To learn more about Roku, our global footprint, and how we've grown, visit https://www.weareroku.com/factsheet.

By providing your information, you acknowledge that you want Roku to contact you about job roles, that you have read Roku's Applicant Privacy Notice, and understand that Roku will use your information as described in that notice. If you do not wish to receive any communications from Roku regarding this role or similar roles in the future, you may unsubscribe here at any time.
