SRE Principal Engineer - Technical Lead

Bangalore, Karnataka, India

Information Technology

Full-Time

GSPANN Technologies, Inc

Overview

Change Management, Incident Response, Dynatrace, Grafana, Splunk, Datadog, New Relic, Azure, Python, CI/CD/CT Pipeline, Kubernetes, Docker, Ansible, DevOps, Terraform, Root Cause Analysis (RCA), SLO/SLA Monitoring, E2E Implementation

Description

GSPANN is hiring a Principal Engineer – Technical Lead for Site Reliability Engineering (SRE) to lead reliability engineering initiatives in Pune or Hyderabad. This full-time role focuses on driving enterprise-wide observability, automation, and infrastructure optimization across global production systems.

Location: Pune / Hyderabad

Role Type: Full Time

Published On: 2 June 2025

Experience: 12 - 15 Years

Role and Responsibilities

Demonstrate deep expertise in monitoring and observability tools such as Dynatrace, Splunk, Datadog, Grafana, and New Relic.
Apply modern observability practices and tools across enterprise environments.
Resolve organizational gaps in SRE implementation by designing scalable, long-term solutions.
Lead cross-functional initiatives to adopt emerging technologies and reliability frameworks.
Influence senior leadership on strategic decisions related to tooling, observability, and transformation.
Analyze complex system issues, uncover performance bottlenecks, and drive root cause resolution.
Drive automation and foster a culture of continuous improvement aligned with evolving technology trends.
Manage cloud infrastructure efficiently, with a strong preference for Microsoft Azure experience.
Write automation scripts proficiently, preferably using Python.
Work with cloud deployment tools including Ansible, Terraform, and Azure DevOps.
Architect and operate containerized environments using Kubernetes and Docker.
Utilize configuration management solutions such as Chef, Ansible, and AWS CodeDeploy.
Implement and optimize Continuous Integration/Continuous Deployment (CI/CD) pipelines using tools like GitLab, Jenkins, Bamboo, Travis CI, and CircleCI.
Solve technical issues independently and deliver sustainable solutions with minimal supervision.
Lead change and incident management processes, while driving strategic SRE transformation at scale.
Standardize observability across teams with end-to-end (E2E) implementation and innovative approaches.
Champion enterprise-grade monitoring strategies using industry-leading tools.
Build scalable infrastructure using Infrastructure as Code (IaC) principles and technologies.
Exhibit visionary thinking, proactive leadership, and strong troubleshooting expertise.
Define, implement, and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
Coordinate and lead incident response while conducting thorough Root Cause Analysis (RCA).

Skills And Experience

Bachelor's degree in Computer Science, Information Science, Engineering, or a related field.
12+ years of experience in Site Reliability Engineering (SRE) or DevOps roles, with a strong focus on managing production systems.
Ensure high availability, low latency, optimal performance, and cost-efficient operations for global e-commerce platforms.
Spearhead change and incident management across business-critical systems.
Mentor and guide product teams in embedding observability and operational excellence throughout the delivery pipeline.
Architect and deploy unified, end-to-end observability dashboards tailored for engineering and business stakeholders.
Define instrumentation standards and build reusable patterns to scale best practices across teams.
Collaborate with cross-functional stakeholders to integrate reliability into every stage of product development.
Develop proprietary tools that close gaps in software delivery and incident response.
Lead the adoption of SRE best practices to systematically improve resilience and uptime.
Automate key operations to ensure rapid and effective incident handling.
Monitor and enforce compliance with SLOs and ensure uninterrupted availability of mission-critical services.
Continuously optimize infrastructure to lower operational costs and seamlessly manage demand surges.
