Pune, Maharashtra, India
Information Technology
Full-Time
AiFA Labs
Overview
Key Responsibilities
- Design, implement, and maintain scalable and secure cloud infrastructure to support AI/ML model training and deployment.
- Automate the provisioning, deployment, monitoring, and management of infrastructure and services.
- Build and maintain CI/CD pipelines for both traditional software and machine learning models (MLOps).
- Implement infrastructure-as-code (IaC) using tools like Terraform, Ansible, or CloudFormation.
- Ensure system reliability and availability through monitoring, logging, alerting, and incident response.
- Manage containerization and orchestration using Docker and Kubernetes.
- Enforce security best practices across all aspects of the infrastructure (cloud, containers, pipelines).
- Optimize resource usage and cost efficiency in cloud environments.
Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in DevOps, Cloud Engineering, or a similar role.
- Hands-on experience with AWS, GCP, or Azure (AI/ML services experience preferred).
- Proficiency in scripting languages such as Python and Bash.
- Advanced Linux administration and troubleshooting skills.
- Intermediate shell scripting or Windows PowerShell scripting skills (automation, monitoring, and system tasks).
- Experience with CI/CD tools such as Jenkins or GitHub Actions.
- Strong knowledge of Docker, Kubernetes, and Helm.
- Experience with monitoring/logging tools (e.g., Prometheus, Grafana, ELK stack).
- Experience setting up cloud alerting (e.g., SMS notifications, billing alerts).
- Understanding of networking, security best practices, and system architecture.