Role Overview
We are seeking a Senior DevOps Engineer with 4+ years of experience to architect, manage, and optimize cloud-native infrastructure. The ideal candidate should have strong expertise in Kubernetes (EKS/GKE/AKS), CI/CD automation, GitOps workflows, cloud monitoring, and multi-cloud environments.
You will be responsible for deploying scalable infrastructure, managing databases and caches, implementing security practices, and driving DevOps automation across cloud platforms. This role requires experience with containerized workloads, infrastructure as code, observability tools, and collaboration with cross-functional teams.
Roles and Responsibilities
- Design and manage cloud infrastructure across AWS, Azure, or GCP, using Terraform or CLI tools
- Set up and maintain Kubernetes clusters (e.g., EKS, GKE, AKS) with autoscaling nodes
- Create and maintain Helm charts for workloads such as FastAPI applications, PostgreSQL, MongoDB, Redis, and Elasticsearch
- Deploy managed services such as PostgreSQL (RDS/Azure Database for PostgreSQL/GCP Cloud SQL), MongoDB (DocumentDB/Cosmos DB/Atlas), and Redis (ElastiCache/Azure Cache for Redis/Memorystore)
- Build CI/CD pipelines using GitHub Actions, integrating automated testing, containerization, and deployment
- Implement GitOps workflows using Argo CD, connected to GitHub
- Define and manage Kubernetes Deployments, Services, ConfigMaps, and Secrets
- Configure Ingress Controllers (e.g., AWS Load Balancer Controller, NGINX) for secure routing
- Deploy and manage Elasticsearch/OpenSearch or equivalents on supported cloud platforms
- Set up observability stacks using Prometheus, Grafana, and cloud-native logging solutions (CloudWatch, Google Cloud Logging (formerly Stackdriver), Azure Monitor)
- Implement secrets management and access control using IAM, RBAC, and cloud-native tools (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault)
- Enable autoscaling for services using Horizontal Pod Autoscaler (HPA) and cloud-native autoscaling options
- Conduct performance and load testing using tools like k6 or Locust
- Collaborate with developers, AI engineers, and product managers for operational readiness
- Participate in Agile/Scrum workflows and lead infrastructure planning sessions
Requirements
Technical Skills:
Cloud & Infrastructure:
- Experience with at least one major cloud provider (AWS, Azure, or GCP); AWS preferred
- Familiarity with core services: compute, storage, databases, networking, and IAM
- Infrastructure provisioning with Terraform, Pulumi, or cloud-native IaC
Containerization & Orchestration:
- Hands-on experience with Docker and Kubernetes (EKS/GKE/AKS)
- Proficiency in deploying and exposing microservices using Helm charts and Ingress resources
CI/CD & GitOps:
- Strong knowledge of GitHub Actions, GitOps workflows, and Argo CD
Monitoring & Logging:
- Proficiency with Prometheus, Grafana, and platform-native tools (e.g., CloudWatch, Azure Monitor, Google Cloud Monitoring (formerly Stackdriver))
Databases & Caching:
- Familiarity with PostgreSQL, MongoDB, and Redis across cloud-managed platforms
Security & Secrets Management:
- Experience with IAM, RBAC, encrypted secrets, and tools like AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager
Testing & Optimization:
- Proficiency in load and performance testing using k6, Locust, or JMeter
General DevOps Requirements:
- Solid foundation in Linux system administration
- Strong understanding of networking, DNS, SSL/TLS, firewalls, and load balancers
- Familiarity with configuration management tools (e.g., Ansible, Puppet)
- Experience with event-driven architectures and message queues (e.g., RabbitMQ, Kafka)
- Ability to troubleshoot complex distributed systems
- Awareness of DevSecOps practices, including vulnerability scanning and shift-left security
- Strong scripting skills in Python, Bash, or another shell language
- Knowledge of version control, branching, and release workflows (Git)
Soft Skills & Work Approach
- Strong analytical and troubleshooting skills with a focus on infrastructure scalability and reliability.
- Ability to collaborate effectively with cross-functional teams including full stack developers and AI engineers.
- Experience working in an Agile/Scrum environment, following DevOps best practices.
- Proactive mindset with a passion for automation and continuous improvement.
- Excellent communication and documentation skills.
Job Types: Full-time, Permanent
Pay: ₹359,854.04 - ₹1,553,198.28 per year
Benefits:
- Paid sick time
Schedule:
- Day shift
- Fixed shift
- Monday to Friday
Experience:
- Prometheus, Grafana, and platform-native tools: 3 years (Required)
Work Location: In person