Overview
About Vaanee AI
Vaanee AI is an advanced speech technology company building next-generation solutions in multilingual voice cloning, AI-based dubbing, audio post-production automation, and intelligent media workflows. We operate at the intersection of deep learning, high-performance backend systems, and scalable infrastructure to deliver mission-critical, production-grade systems for the media and entertainment industry.
We are looking for a Senior Python Developer who demonstrates mastery of system-level programming constructs, cloud-native design, asynchronous APIs, and high-throughput model inferencing. You will be part of a highly technical, performance-obsessed engineering team building products that take artificial intelligence to real-world scale.
Role Summary
The Senior Python Developer will be responsible for architecting and implementing backend services that interface directly with machine learning models and handle high-concurrency workloads. The role requires end-to-end ownership of services, including performance optimization, containerized deployments, and multi-threaded processing pipelines.
Key Responsibilities
- Architect and implement RESTful APIs using Django, Flask, or FastAPI from scratch.
- Build scalable and thread-safe services to handle model inferencing and data processing.
- Design and manage distributed job queues using RabbitMQ, SQS, or ActiveMQ.
- Leverage multi-processing and asynchronous patterns to ensure application responsiveness.
- Integrate cloud-native services from AWS, Azure, or GCP for compute, queue, and storage management.
- Deploy containerized applications using Docker and orchestrate via Kubernetes.
- Apply performance profiling techniques to optimize memory, latency, and throughput.
- Develop robust CI/CD pipelines with linting, testing, and integration stages.
- Work closely with AI/ML engineers to productionize models.
- Implement GPU offloading and concurrency management in hybrid compute environments.
- Maintain exceptional code quality, documentation, and system-level reliability.
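Several of the responsibilities above (thread-safe services, multi-processing, asynchronous patterns, application responsiveness) come together in one common pattern: an async request handler that offloads CPU-bound model inference to a process pool so the event loop stays free to serve other requests. A minimal stdlib-only sketch, where `run_inference` is a hypothetical stand-in for a real model call and a production service would expose `handle_requests` behind a FastAPI endpoint:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def run_inference(payload: str) -> str:
    # Hypothetical stand-in for a CPU-bound model call
    return payload.upper()

async def handle_requests(payloads):
    # Offload CPU-bound work to worker processes so the event loop
    # can keep serving other requests concurrently
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=2) as pool:
        futures = [loop.run_in_executor(pool, run_inference, p) for p in payloads]
        return await asyncio.gather(*futures)

async def main():
    return await handle_requests(["hello", "world"])
```

The same shape works with a thread pool for I/O-bound steps; a process pool is chosen here because model inference is typically CPU- or GPU-bound.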
Minimum Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related discipline.
- 4+ years of Python programming experience, with production-grade backend development.
- Proven expertise in multithreading, multiprocessing, and asynchronous programming.
- Demonstrable experience with one or more Python frameworks (FastAPI, Flask, Django).
- Strong understanding of distributed systems and event-driven architecture.
- Proficient with Docker, Kubernetes, and deployment pipelines.
- Familiarity with REST API security, caching, and throttling strategies.
- Exposure to high-performance GPU applications and memory management.
- Experience with RabbitMQ, ActiveMQ, or Amazon SQS/SNS.
- Prior work on high-throughput, low-latency systems preferred.
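The concurrency expertise listed above can be illustrated with a short stdlib sketch: in CPython the global interpreter lock (GIL) allows only one thread to execute Python bytecode at a time, so threads help I/O-bound work while separate processes are needed to parallelize CPU-bound work. `count_squares` below is a hypothetical stand-in for a CPU-bound task:

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def count_squares(n: int) -> int:
    # Pure-Python arithmetic loop: CPU-bound, holds the GIL while it runs
    total = 0
    for i in range(n):
        total += i * i
    return total

def run_with(executor_cls, n: int, workers: int = 2):
    # Identical work either way; only wall-clock behavior differs
    with executor_cls(max_workers=workers) as ex:
        return list(ex.map(count_squares, [n] * workers))

# ThreadPoolExecutor: results are correct, but the GIL serializes the loop,
# so CPU-bound throughput does not improve.
# ProcessPoolExecutor: each worker has its own interpreter and GIL, so the
# loops run in parallel, at the cost of pickling inputs and outputs.
```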
Preferred Skills
- Experience with CUDA, TensorRT, or Triton Inference Server.
- Understanding of advanced Python features: memoryview, ctypes, asyncio, and context managers.
- Prior exposure to MLOps and serving ML models in production.
- Familiarity with microservices architecture, API gateway management, and observability tools (Prometheus, Grafana).
- Ability to write and maintain performance benchmarks and test suites.
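Two of the advanced features named above can be shown in a few lines: `memoryview` provides zero-copy, writable views over a buffer (useful for large audio payloads), and a context manager scopes acquisition and release of that view. A small illustrative sketch; the `pcm_frames` and `silence_prefix` names are hypothetical:

```python
from contextlib import contextmanager

@contextmanager
def pcm_frames(raw: bytearray):
    # Yield a zero-copy view of the raw audio buffer; release it on exit
    view = memoryview(raw)
    try:
        yield view
    finally:
        view.release()  # allows the bytearray to be resized again

def silence_prefix(view, n: int) -> None:
    # Writes through the view mutate the underlying buffer with no copies
    view[:n] = bytes(n)
```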
Benefits
- Competitive compensation package with ESOP opportunities.
- Remote-first environment with flexible work hours.
- Exposure to cutting-edge AI systems and tools.
- Opportunity to work with an elite team of AI, infrastructure, and media professionals.
- Access to wellness programs, including confidential support systems.
Job Types: Full-time, Permanent
Pay: ₹12,885.50 - ₹64,902.56 per month
Benefits:
- Work from home
Schedule:
- Day shift
Supplemental Pay:
- Performance bonus
Application Question(s):
- In Python, what’s the difference between multithreading and multiprocessing? Why might multithreading fail to speed up a CPU-bound task in CPython?
- Explain the steps and key components involved in developing a secure REST API using FastAPI. What precautions would you implement in production?
- A Docker container running your Python API crashes after 30 seconds in Kubernetes, but runs locally. What are your first 3 debugging steps?
Work Location: Remote