About this role
Zenskar is building the operational backbone for how B2B companies run their business. As a DevOps Engineer, you will own the infrastructure that everything else runs on — and at a scaling SaaS company, that matters a lot. When infra is broken, nothing ships. When it's well-built, the rest of the team barely thinks about it. That's the bar.
This is not a ticket-queue role. You will not be a service desk for developers. You will design, build, and evolve the platform that keeps Zenskar's systems reliable, fast, and secure — and you'll do it with a software engineer's mindset, not an IT admin's.
WHAT YOU'LL DO
- Design and own cloud infrastructure end-to-end — from architecture decisions to production operations
- Build and maintain CI/CD pipelines that make shipping safe, fast, and boring (boring is good)
- Own the observability stack — make sure we know when something breaks before a customer does
- Drive infrastructure cost optimisation without compromising reliability or developer experience
- Work closely with backend engineers to make deployments, rollbacks, and incident response feel effortless
- Identify, document, and eliminate toil — if you're doing something manually more than twice, automate it
- Embed security and compliance thinking into infrastructure by default — not as a retrofit
- Be the person who asks "what happens when this fails?" before anyone else does
THE IMPACT YOU'LL MAKE
- Your infrastructure decisions will determine how reliably Zenskar's enterprise clients can run their business on our platform — downtime or data issues at this layer have direct consequences
- You will build the foundation that lets the engineering team ship faster without breaking things
- Your automation and tooling will compound over time — good work here multiplies everyone else's output
- You will be the person who turns "the infra is always on fire" into "infra just works" — and that shift has a real, visible impact on the company's velocity
Key qualifications
Must have:
- 3–5 years of hands-on DevOps, SRE, or Platform Engineering experience at a product company
- Strong Kubernetes experience in production — if you've debugged a CrashLoopBackOff at 2am and lived to tell the tale, you're in the right place
- Infrastructure-as-Code with Terraform — not just familiarity, but the ability to write, review, and refactor production-grade Terraform without hand-holding
- Deep AWS experience — ECS/EKS, Lambda, CloudWatch, IAM, VPC, and enough Cost Explorer to know where money goes when bills spike
- CI/CD ownership — you've built pipelines, not just used them; GitHub Actions, GitLab CI, or equivalent at real scale
- Can describe the hard infra problems you've solved, why they were hard, and what changed as a result — not just a list of tools on a resume
- Hands-on AWS ECS experience in production — task definitions, service scaling, capacity providers, deployment strategies, and circuit breakers; not just EC2 or generic container orchestration
- Lambda operations at scale — function lifecycle management, event source mapping, cold start tuning, and migrating Lambda-based workloads to more appropriate compute patterns as systems mature
- End-to-end observability ownership — alerting pipelines, custom metrics, structured log ingestion, and actually diagnosing production issues with the stack; not just setting up dashboards
- Secrets and credentials management in AWS — rotation policies, least-privilege access patterns, and the security hygiene that keeps them clean over time
Good to have:
- Scripting ability in Python or Go for automation and internal tooling — the kind of thing that saves a team hours every week
- Observability stack hands-on — Prometheus, Grafana, VictoriaMetrics, or Datadog in production; comfortable diagnosing issues across services, not just building dashboards
- Kustomize experience alongside Terraform for Kubernetes configuration management
- Apache Airflow or similar data pipeline infrastructure
- Security and compliance awareness — understands what SOC 2 means at the infra layer, not just on paper
- Cost optimisation wins you can point to — concrete numbers, concrete impact
- Experience building or maintaining an Internal Developer Portal (Backstage or similar)
- B2B SaaS or fintech background — multi-tenant systems, external integrations, enterprise reliability expectations
- Early-stage startup experience — comfortable when the runbook doesn't exist yet because you're writing it
- Identity infrastructure (self-hosted Keycloak, or managed platforms such as Okta or Auth0): operational experience, not just integration
- Metrics-based autoscaling for worker fleets — scaling on queue depth or custom application metrics, not just CPU/memory
- Not taking yourself too seriously :)
WHAT DRIVES YOU:
- You treat infrastructure like software — version controlled, tested, reviewable, improvable
- You automate the thing that annoyed you last week — without being asked
- You own problems end-to-end: an incident isn't closed when the alert clears, it's closed when the postmortem is done and the fix is in
- You have opinions on the right way to build infra, but you're not precious about them — you change your mind when the tradeoffs change
- You thrive in environments where the answer to "what's the runbook for this?" is sometimes "write one"
Location
- Hybrid — 2 days per week in office
- Office Location: Indiranagar, Bengaluru
- Address: 3rd Floor, A Wing No 1, Carlton Towers, HAL Old Airport Rd, HAL 2nd Stage, Indiranagar, Bengaluru, Karnataka 560008
Interview Process
Our interview process is structured, transparent, and efficient:
- R0 – Recruiter Screening: Quick conversation to assess basic fit, motivation, and role expectations
- Round 1 – Introductory Chat: Focuses on your past experience, the infra problems you've owned, and how you think about reliability and developer experience. We recommend reviewing the job description and the CEO's recorded videos before this step
- Round 2 – Technical Assessment & Discussion: Evaluates your system design instincts, infrastructure thinking, and how you approach real-world problems under constraints
- Reference Checks: We request contact details of two former direct managers. The hiring manager will connect with them to better understand your working style and how you operate under pressure
- Round 3: A final round-up of all the conversations
The process may vary slightly if we think it would be useful for you to meet additional members of the team.
How to apply
Interested? Apply here → https://evolve.keka.com/careers/jobdetails/72378