Senior Software Engineer - Infrastructure | Antal Tech Jobs
2 Days ago

Senior Software Engineer - Infrastructure

Mumbai, Maharashtra, India
Information Technology
Full-Time
Lexsi Labs

Overview

Lexsi Labs is one of the leading frontier labs focused on building aligned, interpretable, and safe Superintelligence. Our work spans efficient alignment methods, interpretability-led system design, and scalable AI platforms that operate reliably across enterprise and regulated environments. Our mission is to build AI systems that are powerful, transparent, and production-grade by design.

Our team operates with deep technical ownership, minimal hierarchy, and a strong bias toward building systems that work at scale. At Lexsi.ai, infrastructure is not support work. It is a core product capability.

As a Senior Software Engineer (Infrastructure), you will architect, build, and operate the core backend and deployment systems that power the Lexsi AI platform. You will own multi-cloud, serverless, and stateless AI deployments, ensuring our systems scale seamlessly across environments of any size while maintaining correctness, reliability, and cost efficiency.

This role is ideal for someone who thinks like an SDE, a platform engineer, and a DevOps engineer combined, and who takes pride in owning systems end-to-end.

Responsibilities

  • Design and build Python-based backend services that power core platform functionality and AI workflows.
  • Architect and operate AI/LLM inference and serving infrastructure at production scale.
  • Build stateless, serverless, and horizontally scalable systems that can run across environments of varying sizes.
  • Design multi-cloud infrastructure across AWS, Azure, and GCP with portability and reliability as first-class goals.
  • Deploy and manage containerized workloads using Docker, Kubernetes, ECS, or equivalent systems.
  • Build and operate distributed compute systems for AI workloads, including inference-heavy and RL-style execution patterns.
  • Implement Infrastructure as Code using Terraform, CloudFormation, Pulumi, or similar tools.
  • Own CI/CD pipelines for backend services, infrastructure, and AI workloads.
  • Optimize GPU and compute usage for performance, cost, batching, and autoscaling.
  • Define and enforce reliability standards targeting 99%+ uptime across critical services.
  • Build observability systems for latency, throughput, failures, and resource utilization.
  • Implement security best practices across IAM, networking, secrets, and encryption.
  • Support compliance requirements (SOC 2, ISO, HIPAA) through system design and evidence-ready infrastructure.
  • Lead incident response, root cause analysis, and long-term reliability improvements.
  • Collaborate closely with ML engineers, product, and leadership to translate AI requirements into infrastructure design.

Required Qualifications

  • 3+ years of hands-on experience in backend engineering, platform engineering, DevOps, or infrastructure-focused SDE roles.
  • Strong Python expertise with experience building and running production backend services.
  • Experience with Python frameworks such as FastAPI, Django, Flask, or equivalent.
  • Deep hands-on experience with Docker and Kubernetes in production environments.
  • Practical experience designing and operating multi-cloud infrastructure.
  • Strong understanding of Infrastructure as Code and declarative infrastructure workflows.
  • Experience building and maintaining CI/CD pipelines for complex systems.
  • Solid understanding of distributed systems, async processing, and cloud networking.
  • Strong ownership mindset with the ability to build, run, debug, and improve systems end-to-end.

Nice to Have

  • Experience deploying AI or LLM workloads in production environments.
  • Familiarity with model serving frameworks such as KServe, Kubeflow, or Ray.
  • Experience running GPU workloads, inference batching, and rollout strategies.
  • Exposure to serverless or hybrid serverless architectures for AI systems.
  • Experience implementing SLO/SLA-driven reliability and monitoring strategies.
  • Prior involvement in security audits or compliance-driven infrastructure work.
  • Contributions to open-source infrastructure or platform projects.
  • Strong system design documentation and architectural reasoning skills.

What Success Looks Like

  • Lexsi’s AI platform scales cleanly across cloud providers and deployment sizes.
  • Inference systems are reliable, observable, and cost-efficient under real load.
  • Engineering teams ship faster because infrastructure is predictable and well-designed.
  • Incidents are rare, understood deeply when they occur, and lead to durable fixes.
  • Infrastructure decisions support long-term platform scalability, not short-term hacks.

Next Steps & Interview Process

  • Take-home assignment focused on real infrastructure and scaling problems.
  • Deep technical interview covering system design, failure modes, and trade-offs.
  • Final discussion focused on ownership, reliability mindset, and execution style.

We avoid process theatre. If you can design systems that don’t fall apart under pressure, we’ll move fast.
