Overview
Responsibilities
Design, build, and maintain large-scale batch data pipelines and related applications with a cross-functional, geographically distributed team of Data Engineers.
Implement scalable data processing solutions in public or hybrid cloud environments, applying sound software engineering principles and modern data patterns.
Collaborate with product managers, analysts, and data scientists to translate business requirements into technical solutions, ensuring data quality, reliability, and observability.
Write clean, modular, and testable code using languages such as Scala or Java and frameworks such as Spark, with attention to maintainability and performance.
Contribute to team-owned data products that serve downstream consumers across analytics, reporting, and machine learning platforms.
Analyze data patterns using SQL, and debug and resolve data quality issues, bottlenecks, and performance constraints in complex systems.
Develop visually impactful reports that convey key insights, trends, and actionable recommendations to diverse stakeholders.
Proactively identify areas for technical and process improvements, contributing to the team's agile development practices and continuous delivery culture.
Use AI-powered coding assistants such as GitHub Copilot or Claude to enhance productivity and code quality.
Participate in peer reviews, knowledge-sharing, and team retrospectives to support ongoing team growth and collaboration.
Required Skills
You have built and maintained at least one data pipeline or data product in a production environment, ideally on cloud infrastructure (e.g., AWS, GCP, Azure).
You are strong in a programming language or framework relevant to data engineering (e.g., Scala, Java, Spark) and have moderate familiarity with other applicable languages.
You're experienced with SQL and have worked with both traditional RDBMS and distributed data storage systems.
You have exposure to technologies such as Hive, Airflow, Iceberg, Collibra, Power BI, and Qubole.
You value testing, monitoring, and alerting as core parts of building reliable systems.
You're comfortable working in an agile, collaborative team environment, and can clearly communicate trade-offs and design decisions to technical and non-technical stakeholders.