2500000 - 3000000 INR - Yearly
Guwahati, Assam, India
Information Technology
Full-Time
Simplismart
Overview
About Simplismart
Simplismart is a GenAI inference platform to deploy, scale, and monitor any GenAI model (LLMs, speech, vision, or diffusion) across cloud or on-prem. It is built for strict SLAs, enterprise-grade security, and full observability. Its modular design lets you optimize for cost or latency, or auto-select the best topology per workload.
Role Overview
In this role, you will design and build the high-level architecture of Simplismart’s MLOps platform from the ground up, enabling scalable, reliable, and GPU-accelerated ML workflows across the product ecosystem.
What Is Expected of You-
- Design and implement the core architecture of a next-generation MLOps platform capable of running diverse GPU-accelerated workloads at scale.
- Formalize and standardize heterogeneous ML workloads, including LLM, VLM, ASR, and diffusion pipelines, and build orchestration abstractions for them.
- Build internal systems for continuous deployment of services, modules, and model pipelines across multi-cloud and hybrid environments.
- Create frameworks for high reliability, observability, and fault-tolerance for mission-critical inference, training, and data pipelines.
- Collaborate closely with Applied ML and Core ML teams to improve system reliability, latency, and cost efficiency.
- Develop internal tooling to benchmark, evaluate, and deploy models quickly and consistently.
- Ship production-grade code and infrastructure using strong engineering fundamentals and test-driven development (TDD), aligning with Simplismart’s engineering culture.
- Troubleshoot complex systems, performance bottlenecks, GPU behavior, and distributed workloads.
What We’re Looking For-
- Deep technical expertise in system design, distributed systems, and GPU-based ML workloads.
- Strong software engineering fundamentals (data structures, APIs, testing, debugging).
- Experience with infrastructure-as-code (Terraform, Ansible) and cloud platforms (AWS/GCP/Azure).
- Strong knowledge of ML fundamentals, model architectures (Transformers, CNNs), and inference behavior.
- Ability to build, maintain, and reason about multi-step pipelines (ETL → model → evaluation → deploy).
- Strong systems knowledge: Linux internals, networking, performance tuning, GPU memory behavior.
- Ability to work independently, own large ambiguous problems, and collaborate across teams.
- Excellent communication skills, with the ability to articulate design decisions, tradeoffs, and system impacts clearly.
Good to Have-
- Experience with modern inference stacks such as TensorRT, Triton, vLLM/TGI, SGLang.
- Exposure to quantization, model optimization, or CUDA concepts.
- Hands-on experience with Llama/Mistral, Whisper, or Stable Diffusion pipelines.
- Familiarity with CI/CD, Docker, GitHub workflows, and IaC-driven deployments.
- Experience designing high-availability or fault-tolerant production systems.
Why Join Simplismart?
- Opportunity to define and lead the brand identity of a fast-growing GenAI company.
- Work closely with leadership on high-impact initiatives from global event campaigns to overall storytelling.
- Be part of a team that values design as a strategic lever, not just execution.
- Competitive compensation and growth opportunities in a high-energy startup environment.