2 Days ago

Site Reliability Engineer (SRE) - Founding Team

2,000,000 - 2,500,000 INR per year

Mumbai, Maharashtra, India

Space Exploration & Research, Information Technology

Full-Time

Potpie AI

Overview

This is one of the first site reliability engineering positions to open at potpie 🥧, and we are excited to discover and onboard a great mind to work alongside us. You'll be instrumental in building the robust, scalable, and resilient infrastructure required to build and deploy AI agents for engineering use cases like debugging, system design, and testing.

🏄🏻‍♀️ What is potpie?
Potpie 🥧 is an open-source platform that understands your codebase and helps you build use-case-specific AI agents for your developer workflows. We also provide users with ready-to-use agents for engineering use cases like debugging, low-level design, and testing.

📝 Responsibilities
As a Site Reliability Engineer at potpie, you will:

Design, implement, and maintain the core infrastructure and CI/CD pipelines to ensure high availability, scalability, and performance of the potpie platform and its AI agents.
Be responsible for observability (logging, monitoring, alerting) across the stack to proactively identify and resolve issues.
Automate deployment, scaling, and operational tasks using infrastructure-as-code (IaC) principles.
Collaborate closely with the backend and product teams to plan features and ensure new deployments meet reliability and performance standards.
Conduct system design reviews with a focus on reliability, fault tolerance, and disaster recovery.
Participate in the on-call rotation to respond to and resolve critical production incidents efficiently.
Drive the adoption of best practices for security, performance optimization, and cost management within the cloud infrastructure.

🏆 Proof of Work & Qualifications
At potpie, we accept any meaningful project as proof of work. If you don’t have work experience but have an open-source project, that counts. We don’t measure your ability by the number of years you have worked, though years of experience can sometimes be a good proxy for your capabilities.

✅ Must-haves

Expertise in Cloud Infrastructure (e.g., AWS, GCP, Azure), particularly managing and deploying applications at scale.
Strong practical experience with Kubernetes (e.g., GKE, EKS) and containerization technologies (Docker).
Solid understanding of SRE principles and practices, including SLOs, SLIs, error budgets, and post-mortem analysis.
Experience with Infrastructure-as-Code tools (e.g., Terraform, Ansible).
Proficiency in setting up and managing monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack, Datadog).
Strong scripting and automation skills in a language like Python or Go.
Familiarity with database operations and reliability (e.g., PostgreSQL, Redis, MongoDB).
Excellent debugging and problem-solving skills, with attention to detail, especially in distributed systems.

👌🏻 Good to have

Experience with AI/ML infrastructure or deploying LLM-based applications.
Tangible open-source contributions related to infrastructure, DevOps, or reliability.
Experience in a startup environment or building infrastructure from the ground up.
Strong knowledge of networking and security best practices for cloud-native applications.

🤝 Why should you consider working at potpie?
Working at potpie is the right bet for you if you can relate to the following:

You are tired of building the same standard infrastructure setups and are eager to tackle challenging technical problems related to scaling AI agents.
You want to do impactful work that brings positive change to the lives of fellow developers.
You have out-of-the-box ideas and want the autonomy to chase them.
You want to work across the stack on a fast-paced project from day one.
You want the opportunity to build the company culture you always wanted at work.
You aspire to build something of your own one day.

🧩 How do we hire engineers at potpie?
Introductory call: A brief call to understand your background, expectations, and ambitions.

Assessment: A take-home assignment (48 hours) relevant to the SRE role, followed by a discussion with an engineering interviewer. This step may be skipped if there is substantial proof of work.

Technical Interview: The core technical round, assessing your system design, SRE knowledge, and algorithmic thinking. We will collaborate on designing a real-time solution to evaluate your fundamentals.

Pro-Tip: If you want to grab our attention, the best way is to find a bug in the application and raise a PR: https://github.com/potpie-ai/potpie
