Site Reliability Engineer (SRE) | Antal Tech Jobs
2 Days ago

Site Reliability Engineer (SRE)

600,000 - 1,800,000 INR per year
Pune, Maharashtra, India
Information Technology
Full-Time
Neverinstall

Overview

What You'll Do

  • Design and implement comprehensive monitoring and alerting systems
  • Build observability for our distributed architecture (streaming servers, microservices, orchestration)
  • Respond to and resolve service outages, with a focus on rapid recovery
  • Create runbooks, incident response procedures, and post-mortem processes
  • Implement SLI/SLO frameworks for enterprise customer SLA compliance
  • Monitor and optimize performance across multiple cloud providers (Azure, OCI)
  • Build automation for deployment, scaling, and recovery processes
  • Design disaster recovery and business continuity procedures
  • Work with engineering teams to improve system reliability and reduce MTTR
  • Implement chaos engineering and reliability testing practices

What We're Looking For

  • 3+ years of SRE, DevOps, or production systems experience
  • Strong experience with monitoring tools (Prometheus, Grafana, ELK, or similar)
  • Experience with incident management and on-call responsibilities
  • Knowledge of distributed systems reliability patterns and practices
  • Understanding of cloud platforms (Azure, AWS, GCP, OCI) and their monitoring tools
  • Experience with Kubernetes, container orchestration, and microservices
  • Scripting and automation skills (Python, Go, Bash, or similar)
  • Strong troubleshooting and debugging skills across the full stack

Nice to Have

  • Experience with enterprise SLA management and reporting
  • Knowledge of streaming protocols, real-time systems, or VDI platforms
  • Experience with multi-cloud architectures and failover strategies
  • Understanding of network protocols and performance optimization
  • Background in high-availability systems or financial services
  • Experience with infrastructure as code and GitOps practices
  • Knowledge of security monitoring and compliance frameworks

Key Responsibilities Include

  • Incident Response: Lead major incident response and coordinate cross-team resolution
  • Monitoring & Alerting: Build comprehensive observability across our streaming and orchestration services
  • Reliability Engineering: Work with development teams to build reliability into new features
  • Performance Optimization: Monitor and optimize system performance for enterprise workloads
  • Documentation: Create and maintain runbooks, troubleshooting guides, and operational procedures
