Overview
Job Summary: We are seeking an experienced Senior DevOps Engineer to join our team, focusing on managing and optimising our on-premises infrastructure. This role is integral to ensuring the high availability, scalability, and security of our services. You will play a key role in designing, building, and maintaining our Kubernetes-based infrastructure, implementing CI/CD pipelines, and automating our processes to improve efficiency and reliability. If you have a strong background in on-prem infrastructure, Kubernetes, and Rancher, along with a passion for security and automation, we’d like to meet you.
Key Responsibilities:
DevOps Expertise:
● On-Prem Infrastructure: Design, manage, and optimise Kubernetes infrastructure using Rancher, ensuring robust, scalable, and secure deployments.
● Services Management: Implement, monitor, and optimise services to ensure high availability, performance, and security.
● Backup and Recovery: Implement and manage backup and recovery strategies for services and Kubernetes clusters, ensuring data integrity and service continuity.
CI/CD Pipeline:
● Implementation: Build and maintain CI/CD pipelines for automated software delivery and deployment.
● Containerization: Containerize applications using Docker and manage orchestration with Kubernetes.
● Monitoring & Automation: Implement monitoring solutions and automate processes to maintain system performance and reliability.
Security & Compliance:
● Security Best Practices: Enforce security best practices within Kubernetes environments, focusing on compliance and risk mitigation.
● Security Audits: Conduct regular security audits and vulnerability assessments to proactively address potential risks.
Collaboration & Documentation:
● Team Collaboration: Work closely with cross-functional teams, including developers and system administrators, to streamline development and operational processes.
● Documentation: Maintain comprehensive documentation of DevOps processes, configurations, infrastructure designs, and security protocols.
Infrastructure Optimization:
● Maintenance: Monitor and maintain the health of our infrastructure and applications using tools like Prometheus, Grafana, and Zabbix.
● Troubleshooting: Resolve complex issues to ensure system reliability and performance.
● Optimization: Continuously evaluate and improve infrastructure and deployment processes to enhance performance, security, and scalability.
Automation:
● Automation Strategies: Drive automation efforts across development and operations workflows, focusing on deployment strategies such as canary and blue-green to ensure safe and efficient releases.
● Efficiency Improvements: Automate manual processes to improve operational efficiency and reduce the potential for errors.
Qualifications:
Education:
● Bachelor’s degree in Computer Science, Information Technology, or a related field.
Experience:
● 7+ years of experience as a DevOps Engineer, Site Reliability Engineer, or in a similar role.
● 4+ years of experience managing production Kubernetes clusters, particularly in on-premises environments.
● 3+ years of experience with cluster management platforms like Rancher.
● 3+ years of experience with on-prem infrastructure and services.
Technical Skills:
● Advanced proficiency with Linux systems and scripting languages such as Bash.
● Strong experience with CI/CD tools such as GitLab CI/CD, Jenkins, FluxCD, or ArgoCD.
● Experience with Infrastructure as Code (IaC) tools like Terraform or Ansible.
● Expertise in database services including Postgres, MongoDB, and MySQL.
● Proficiency in containerization and orchestration technologies (Docker, Kubernetes).
● Experience with monitoring tools like Prometheus, Grafana, and Zabbix.
● Strong understanding of networking concepts and protocols within on-premises and Kubernetes environments.
● Familiarity with security and compliance best practices, especially within on-prem infrastructure.
● Experience with backup and recovery solutions for services and Kubernetes clusters.
Soft Skills:
● Excellent problem-solving and troubleshooting abilities.
● Strong communication and interpersonal skills, with the ability to collaborate effectively across teams.
Job Type: Full-time
Pay: ₹800,000.00 - ₹1,000,000.00 per year
Benefits:
- Health insurance
- Life insurance
- Provident Fund
Schedule:
- Monday to Friday
Ability to commute/relocate:
- Malad, Mumbai, Maharashtra: Reliably commute or plan to relocate before starting work (Preferred)
Experience:
- Total work experience: 5 years (Preferred)
Work Location: In person