Pune, Maharashtra, India
Information Technology
Full-Time
Cognizant
Overview
Job Summary
Site Reliability Engineer with proficiency in Cloud DevOps and Application observability
Responsibilities
- Applies technical knowledge and problem-solving methodologies to projects of moderate scope, with a focus on improving the data and systems running at scale, and ensures end-to-end monitoring of applications
- Resolves most nuanced issues and determines the appropriate escalation path
- Builds, supports, monitors, and automates web products on private cloud infrastructure
- Demonstrates and champions site reliability culture and practices and exerts technical influence throughout your team
- Drives initiatives to improve the reliability and stability of web hosting platforms, using data-driven analytics to improve service levels
- Collaborates with team members to identify comprehensive service level indicators, and with stakeholders to establish reasonable service level objectives and error budgets with customers
- Demonstrates a high level of technical expertise within one or more technical domains and proactively identifies and solves technology-related bottlenecks in your areas of expertise
- Collaborates with technical experts, key stakeholders, and team members to resolve complex problems
- Provides comprehensive and ongoing guidance, tools, and solutions to support the firm's growth
- Works toward becoming an expert on the applications and platforms under your influence while understanding their interdependencies and limitations
- Documents and shares knowledge within your organization via internal forums and communities of practice
- Strong knowledge of one or more infrastructure disciplines such as hardware, networking terminology, databases, storage engineering, deployment practices, integration, automation, scaling, resilience, and performance assessments
- Experience with multiple cloud technologies with the ability to operate in and migrate across public and private clouds
- Drive to develop infrastructure engineering knowledge of additional domains, data fluency, and automation knowledge
- Cloud exposure: understanding of and working experience with resiliency, scalability, observability, monitoring, etc.
- Understanding of data objects and structures, with the ability to write SQL queries based on tickets as needed
- Experience as an SRE in complex, mission-critical applications involving a multitude of components of varying technical generations
- Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices, with the ability to implement these practices within an application or platform
- Strong knowledge of site reliability culture and principles, with demonstrated ability to implement site reliability within an application or platform
- Strong knowledge of and experience in observability, monitoring, alerting, and telemetry collection using tools such as CloudWatch, Grafana, Dynatrace, Prometheus, Splunk, etc.
- Fluency in at least one programming language or automation tool, such as Python, Java (Spring Boot), shell scripting, .NET, Terraform, or Ansible
SRE-related certifications are preferred but not mandatory