Mumbai, Maharashtra, India
Information Technology
Full-Time
UST
Overview
Role Description
We are seeking a highly skilled and proactive Site Reliability Engineer (SRE) to join our Cloud Operations team. The ideal candidate will bring deep expertise in AWS, Linux/Windows systems, DevOps, and Infrastructure as Code (IaC) to ensure the reliability, scalability, and security of cloud-hosted infrastructure and applications. This role offers the opportunity to work on mission-critical systems with a strong focus on automation, monitoring, and compliance.
Key Responsibilities
- Design and implement resilient AWS infrastructure (RDS, IAM, CloudWatch, Amazon MQ, Route 53).
- Manage and support Linux and Windows environments, ensuring uptime and performance.
- Develop and maintain Terraform/CloudFormation IaC templates for scalable provisioning.
- Monitor infrastructure health using CloudWatch, Prometheus, and Grafana, configuring alerts for SLA breaches and vulnerabilities.
- Troubleshoot complex infrastructure and application issues, perform root cause analysis, and implement preventive measures.
- Support and maintain Tomcat application servers with secure, efficient deployments.
- Conduct and support audits (ISO 27001, GDPR, IAM policies), ensuring governance and compliance.
- Enhance CI/CD pipelines and automate operational workflows in collaboration with DevOps.
- Manage DNS configurations (Route 53), including bulk imports and troubleshooting.
- Maintain documentation for SOPs, DR coordination, patching, and release management.
Experience
- AWS Infrastructure: 5+ years
- Linux/Windows Administration: 3+ years
- Troubleshooting: 4+ years
- DevOps Practices: 3+ years
- Infrastructure as Code (IaC): 2+ years (Terraform, CloudFormation)
Technical Skills
- AWS Services: RDS, IAM, CloudWatch, Amazon MQ, Route 53
- Application Support: Tomcat
- Audit & Compliance: ITSM, ISO 27001, GDPR, IAM
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- AWS certifications (SysOps Administrator, Solutions Architect) are a plus.
- Strong analytical, troubleshooting, and problem-solving skills.
- Excellent communication and documentation abilities.
Additional Skills
- Experience with CloudEndure, Cloudberry, or similar DR tools.
- Familiarity with Vault Lock, WORM protection, and cross-account KMS encryption.
- Exposure to containerized environments (ECS, EKS).
- Understanding of ITIL frameworks (incident/change management).
AWS, Linux, Troubleshooting