Overview
Senior DevOps Engineer — Lyearn
Lyearn brings together essential tools for OKR management, performance tracking, and employee engagement with a standout Learning Management System that enables continuous upskilling and growth. Our platform powers organizations that believe learning should be embedded into everyday work, and reliable infrastructure is what makes that possible.
As a Senior DevOps Engineer, you’ll design, operate, and evolve the cloud systems that keep Lyearn secure, scalable, and resilient. This role blends platform engineering, automation, SRE practices, and cloud architecture — ideal for engineers who love building reliable systems at scale and influencing how teams ship software.
What you will do
Platform & Infrastructure Ownership
- Architect and manage multi-cloud environments (AWS & GCP) with a focus on scalability, security, and cost efficiency.
- Define and evolve our infrastructure roadmap, aligning reliability with product growth needs.
- Implement Infrastructure as Code using Terraform/CloudFormation for consistent, repeatable environments.
- Design high-availability topologies, disaster-recovery plans, and automated backup strategies.
CI/CD, Automation & Developer Enablement
- Build and maintain automated CI/CD pipelines (GitHub Actions, Jenkins) to support frequent, reliable releases.
- Standardize build, test, and deployment workflows across engineering teams.
- Create self-service DevOps tooling that accelerates developer productivity.
- Enforce deployment safety with staged rollouts, canaries, and rollback strategies.
Containers, Orchestration & Observability
- Deploy and operate containerized workloads using Docker and Kubernetes across environments.
- Implement strong monitoring, logging, and alerting using Prometheus, Grafana, and centralized logging stacks.
- Establish SLOs, SLAs, and error-budget practices to drive reliability.
- Lead incident response, post-mortems, and continuous reliability improvements.
Security, Compliance & Governance
- Implement cloud security best practices including IAM, secrets management, and network isolation.
- Harden clusters, pipelines, and runtime environments.
- Support compliance-ready controls for training, audits, and data privacy.
- Continuously assess vulnerabilities and automate remediation workflows.
AI Platform & Advanced Capabilities
- Support infrastructure powering LLM-driven features, vector search, and RAG pipelines.
- Optimize performance and scalability of AI-related services in production.
- Collaborate with product and AI teams to safely roll out intelligent capabilities.
Who you are
Technical Expertise
- 3+ years in DevOps, SRE, or Platform Engineering roles running production systems.
- Deep experience with AWS and GCP architectures and cost optimization.
- Strong hands-on expertise with Kubernetes, Docker, and container orchestration.
- Proven experience building CI/CD pipelines (Jenkins, GitHub Actions).
- Solid understanding of networking, load balancing, and distributed systems fundamentals.
- Proficiency with Infrastructure as Code (Terraform/CloudFormation).
- Experience implementing observability stacks and incident management processes.
- Familiarity with databases, caching layers, and message queues from an operations standpoint.
Collaboration & Leadership
- Comfortable partnering with engineering teams to define standards and best practices.
- Strong communication skills, including the ability to explain trade-offs clearly.
- Pragmatic problem solver who balances speed with safety and reliability.
Bonus: AI/ML Platform Exposure
- Experience supporting systems that use LLM APIs or vector databases.
- Understanding of RAG pipelines and data-retrieval infrastructure.
What makes this role special
You’ll play a central role in building the platform foundation behind an AI-driven learning product used by organizations worldwide. Your work will directly influence reliability, developer velocity, and user trust. You’ll own meaningful systems end-to-end, while shaping best practices and mentoring others.
Work at Lyearn
If building resilient, scalable infrastructure excites you, we’d love to talk.