Overview
Business Area: Engineering
Seniority Level: Mid-Senior level
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
Cloudera is seeking a Senior Software Engineer to join our Bangalore-based SDX Observability Infrastructure team. This is a growth opportunity for a technically proficient engineer who enjoys the challenge of building next-generation telemetry systems while ensuring the continued reliability of mission-critical data lake services.
You will be a key individual contributor in a high-performing team. Your mission is twofold: leading the technical expansion of our cross-service Observability & Metrics infrastructure (75% focus) and acting as the technical steward for the SDX Backup and Restore (BDR) framework through maintenance and customer escalation support (25% focus).
This role offers the opportunity to work on the "nervous system" of the Cloudera Data Platform. You will gain deep experience in how large-scale enterprise data clouds are monitored and protected, working on a team that values technical excellence and collaborative problem-solving.
As a Senior Software Engineer, you will:
- Work on large-scale, distributed clusters to build and extend the core infrastructure required to collect and aggregate high-cardinality metrics across all Cloudera services.
- Implement and refine instrumentation libraries and OpenTelemetry (OTel) collector configurations to standardize telemetry data across the CDP stack.
- Research and integrate AI/ML tools to automate management tasks, such as intelligent metric collection tuning, anomaly detection, and predictive scaling of telemetry pipelines.
- Write design documentation for key features and capabilities
- Improve code quality through writing tests, automation, and code reviews
- Own small projects that maintain the existing Java codebase for Data Lake Backup and Restore, providing essential bug fixes, security patches, and minor enhancements to keep the BDR framework robust and enterprise-ready.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science or equivalent, and 5+ years of software development experience
- Expert proficiency in Java and/or Go
- Experience working with observability, metrics, or streaming infrastructure
- Familiarity with cloud storage primitives (AWS S3, Azure ABFS, or GCS).
- Experience navigating complex distributed systems to resolve high-pressure customer situations.
- Experience with Kubernetes and containers
- Strong oral and written communication skills in English
- A team-first mindset with the ability to take a high-level design and run with the implementation to completion.
- Practical experience working with Prometheus, Grafana, and the OpenTelemetry (OTel) ecosystem.
- Experience with Python, Bash, SQL, and PromQL
- Knowledge of and experience with AI/ML
- Recognized contributions to open source projects
- Experience or a strong interest in applying Machine Learning to operational data (e.g., using AIOps for log pattern recognition or metric threshold tuning).
Benefits:
- Generous PTO Policy
- Support for work-life balance with Unplugged Days
- Flexible WFH Policy
- Mental & Physical Wellness programs
- Phone and Internet Reimbursement program
- Access to Continued Career Development
- Comprehensive Benefits and Competitive Packages
- Paid Volunteer Time
- Employee Resource Groups
EEO/VEVRAA