Jaipur, Rajasthan, India
Information Technology
Full-Time
GoKwik
Overview
About GoKwik
GoKwik is a growth operating system designed to power D2C and eCommerce brands, from checkout optimization and return-to-origin (RTO) reduction to payments, retention, and post-purchase engagement. Today, GoKwik supports over 12,000 merchants worldwide, processes around $2 billion in GMV, and is strengthening its AI-powered infrastructure. Backed by RTP Global, Z47, Peak XV, and Think Investments, and bolstered by a $13 million growth round in June 2025 (total funding: $68 million), GoKwik is scaling aggressively across India, the UK, Europe, and the US.
Why This Role Matters
We are looking for an SRE-focused Engineer to join our DevOps team. This role is 80% Site Reliability Engineering and 20% DevOps enablement, with observability, resilience, and incident management at its core. You will lead on-call operations, build world-class observability systems, and drive reliability engineering practices across the organization. You’ll also collaborate on automation and CI/CD improvements to ensure services are built and operated for scale. We are an engineering-focused team that continuously invests in tools, tests, processes, and technology. We consider our people to be our biggest asset and strive to build a culture of continuous learning and growth.
What You’ll Own
- Lead SRE practices for reliability, scaling, and performance of production systems.
- Lead on-call operations and incident response, ensuring fast resolution and minimizing customer impact.
- Perform deep debugging of production issues across infra, services, and databases.
- Design and automate self-healing, scalable infrastructure.
- Architect, implement, and mature advanced observability (metrics, logs, distributed tracing, SLIs/SLOs, APM) to detect, debug, and prevent outages.
- Support CI/CD and infra automation (Terraform, Kubernetes, pipelines) as part of DevOps responsibilities (20%).
- Mentor junior engineers in incident management and DevOps best practices.
- Partner with engineering teams on resilient architecture reviews.
- Research and propose new tools and industry best practices to improve infrastructure reliability.
- Conduct blameless postmortems, improve incident playbooks, and drive prevention culture.
What You’ll Bring
- 5–8 years of experience in SRE / Production Engineering (with some DevOps exposure).
- Proven expertise in incident management, debugging distributed systems, and on-call operations.
- Strong background in observability platforms (Prometheus, Grafana, Datadog, OpenTelemetry, or similar).
- Deep knowledge of cloud infra (AWS/GCP) including networking, scaling, HA/DR.
- Hands-on with Kubernetes, Terraform, and CI/CD pipelines.
- Experience with incident frameworks, blameless postmortems, and chaos/resiliency testing.
- Ability to balance short-term firefighting with long-term reliability engineering.
- Strong scripting skills (Shell, Python, or Go preferred).
At GoKwik, we aren’t just building tools — we’re rewriting the playbook for eCommerce in India. We exist to solve some of the most complex challenges faced by digital-first brands: low conversion rates, high RTO, and poor post-purchase experience. Our checkout and conversion stack powers 500+ leading D2C brands and marketplaces — and we’re just getting started.