Overview
We are seeking a highly skilled and experienced Senior Data Engineer (L4) to join the program in support of a critical customer engagement. The candidate will design and build scalable data pipelines on GCP, replacing end-of-life legacy ETL tools with modern cloud frameworks and building core data models.
This role is ideal for a self-starter with strong analytical problem-solving skills, specifically the ability to reverse-engineer undocumented legacy code and map it to modern logic without losing business context. The ideal candidate moves quickly between technologies (Java, SQL, Python) and maintains rigorous attention to detail to ensure zero-discrepancy migrations.
The successful candidate will work closely with the Program Manager and customer stakeholders to ensure project success from initial definition through final delivery.
Required Credentials
5-7 years of experience
Required Qualifications
GCP Data Stack: Expertise in BigQuery, Cloud Composer, Cloud SQL, and Dataplex.
Transformation: Proficiency in Dataform, advanced SQL, and Python.
Legacy Migration: Ability to read and interpret Oozie, Pig, Hive, and proprietary ETL configurations.
DevOps: Experience with GitHub Actions and Docker, plus basic Terraform knowledge.
Useful Qualifications
Experience with large-scale historical data migrations from on-prem Oracle/MySQL sources.
Familiarity with "Silver and Gold" data modeling layers.
Contractor Scope of Work and Delivery Expectations
The following outlines the scope of work we anticipate the contractor supporting throughout the project timeline.
Scope and Solution Expectations
Pipeline Development: Design and implement GCP-based ETL/ELT pipelines using Cloud Composer, BigQuery, and Dataform to replace legacy Hadoop-stack jobs.
Data Modeling: Build core data models (Silver and Gold Layers) in BigQuery for downstream analytics and reporting.
Governance Implementation: Ensure data quality and governance by implementing Dataplex, the Data Validation Tool (DVT), and automated quality checks.
Migration Execution: Perform large-scale historical data migrations from on-prem Oracle/MySQL to BigQuery and Cloud SQL.
CI/CD: Develop CI/CD pipelines for all data transformation jobs using GitHub Actions.