Overview
Calix provides the cloud, software platforms, systems and services required for communications service providers to simplify their businesses, excite their subscribers and grow their value.
Job Description
Calix is seeking a highly skilled Sr. DevOps Engineer to join our cutting-edge AI/ML team. In this role, you will be the architect of our internal delivery ecosystem. You will bridge the gap between physical hardware and application performance by building a robust, on-premises "Private Cloud."
Your mission is to ensure that our development teams can deploy code at high velocity while maintaining world-class application health, stability, and observability, all within our managed data center environments.
Key Responsibilities:
- Automated Infrastructure: Design and manage on-premises virtualization (VMware, KVM) and bare-metal environments using Terraform and Ansible to achieve "push-button" infrastructure.
- On-Prem Kubernetes: Architect and maintain self-managed Kubernetes clusters (e.g., Rancher, OpenShift, or upstream Kubernetes), ensuring high availability for containerized workloads.
- Continuous Delivery (CD) Excellence: Build and optimize automated deployment pipelines (GitLab CI, Jenkins, or ArgoCD) to ensure seamless, low-risk application releases.
- Application Health & Observability: Own the end-to-end monitoring strategy. Implement SLOs/SLIs using Prometheus, Grafana, and ELK/OpenSearch to proactively detect and resolve application performance bottlenecks.
- Lifecycle Management: Manage the full lifecycle of internal platform services, including local object storage (MinIO), private registries, and local load balancers.
- Hybrid/On-Prem Strategy: Lead the effort to build a hybrid-ready architecture that treats on-prem resources with the same agility and automation as public cloud services.
- Incident Response & Post-Mortems: Act as a tier-3 escalation point for complex infrastructure and application health issues, leading root cause analysis (RCA) to prevent recurrence.
- Mentorship: Train and guide junior engineers in GitOps practices, CI/CD optimization, and modern systems administration.
Qualifications:
- Experience: 7+ years in DevOps or Systems Engineering, with at least 4 years focused on high-scale on-premises environments.
- CI/CD Mastery: Deep expertise in building complex CD pipelines and GitOps workflows for on-prem Kubernetes.
- Experience with ArgoCD for Kubernetes-native continuous delivery.
- Infrastructure as Code: Expert-level proficiency with Ansible (for OS/Configuration) and Terraform (for Virtualization/Hardware).
- System Internals: Strong Linux administration skills (Ubuntu/RHEL), including kernel tuning, networking (L2/L3), and storage (SAN/NAS/Storage-ready fabrics).
- Observability Stack: Proven experience implementing Prometheus, Grafana, and APM tools (like Jaeger or Dynatrace) to monitor application-level health.
- Scripting: Advanced proficiency in Python, Go, or Bash for building custom automation tooling.
- Soft Skills: Ability to articulate application health metrics and infrastructure ROI to both technical and non-technical stakeholders.
Preferred Qualifications:
- Knowledge of Bare Metal provisioning (MaaS, Ironic, or Razor).
- Certified Kubernetes Administrator (CKA) or equivalent high-level Linux/Networking certifications.
Location:
India (flexible hybrid work model: work from the Bangalore office for 20 days per quarter)