Information Technology
Full-Time
UST
Overview
Job Summary
We are seeking a highly skilled DevOps Engineer with strong expertise in Azure Cloud, Kubernetes, Infrastructure as Code, CI/CD automation, and cloud security practices. The ideal candidate will also bring hands-on experience in MLOps and LLMOps, supporting model deployment, automation pipelines, and GPU-based workload optimization.
Key Responsibilities
Cloud & Infrastructure
- Design, implement, and manage DevOps solutions in Microsoft Azure.
- Demonstrate a strong understanding of Azure cloud infrastructure services and their limitations.
- Monitor application performance and configure scaling strategies, including automatic scale-up and scale-down mechanisms.
- Provide best practices for provisioning production and non-production environments to optimize cloud usage and costs.
- Proactively identify and remediate vulnerabilities within the Azure ecosystem.
- Manage secure credential handling using tools such as HashiCorp Vault or Azure Key Vault.
Containerization & Kubernetes
- Manage and scale production-grade Kubernetes clusters (AKS / OpenShift).
- Develop and implement containerization strategies using Docker and Kubernetes.
- Implement and manage serverless solutions within Azure.
- Maintain a strong understanding of microservices architecture and API communication in containerized environments.
CI/CD & GitOps
- Implement declarative, automated application deployments using GitOps principles with ArgoCD.
- Maintain and enhance CI/CD pipelines using Bamboo and Tekton.
- Configure webhooks and build triggers for CI pipelines.
- Integrate automated security scanning (SAST/DAST) and container image signing into the build process.
Infrastructure as Code
- Develop and manage infrastructure using Terraform.
Security & Monitoring
- Implement security best practices across cloud infrastructure.
- Work with system, security, and network monitoring tools.
- Ensure compliance and continuous monitoring of infrastructure and applications.
Operations & Reliability
- Demonstrate a strong understanding of application architecture and common failure modes.
- Support 24x7 application operations, including:
  - Incident Management
  - Change Management
  - Capacity Management
MLOps & LLMOps Responsibilities
- Model Deployment: Build and manage infrastructure to support LLM serving, fine-tuning, and monitoring.
- Pipeline Automation: Apply MLOps principles to automate data versioning and model training workflows.
- Resource Optimization: Manage GPU-intensive workloads and optimize compute costs within Azure environments.
Required Skills & Qualifications
- Proven experience in DevOps within Azure Cloud environments.
- Hands-on expertise in Kubernetes (AKS/OpenShift), Docker, and serverless architectures.
- Strong experience with Terraform (Infrastructure as Code).
- Experience with CI/CD tools such as Bamboo, Tekton, and GitOps tools like ArgoCD.
- Understanding of cloud security, vulnerability management, and automated security testing (SAST/DAST).
- Experience handling production environments and large-scale distributed systems.
- Exposure to MLOps / LLMOps practices is highly preferred.