Overview
Responsibilities:
Build and optimize batch and streaming data pipelines using Apache Beam (Dataflow); see the illustrative sketch after this list
Design and maintain BigQuery datasets using best practices in partitioning, clustering, and materialized views
Develop and manage Airflow DAGs in Cloud Composer for workflow orchestration
Implement SQL-based transformations using Dataform (or dbt)
Leverage Pub/Sub for event-driven ingestion and Cloud Storage for the raw/data lake layer of the architecture
Drive engineering best practices across CI/CD, testing, monitoring, and pipeline observability
Partner with solution architects and product teams to translate data requirements into technical designs
Mentor junior data engineers and support knowledge-sharing across the team
Contribute to documentation, code reviews, sprint planning, and agile ceremonies
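For a sense of the day-to-day work, here is a minimal, illustrative sketch of the kind of streaming pipeline this role would build and optimize: reading events from Pub/Sub and appending them to a BigQuery table with Apache Beam. The project, subscription, table, and schema names are hypothetical placeholders, not references to our actual environment.

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def run():
        # Streaming mode so the pipeline runs continuously on Dataflow.
        options = PipelineOptions(streaming=True)

        with beam.Pipeline(options=options) as p:
            (
                p
                # Hypothetical subscription; events arrive as JSON-encoded bytes.
                | "ReadEvents" >> beam.io.ReadFromPubSub(
                    subscription="projects/example-project/subscriptions/events-sub")
                | "ParseJson" >> beam.Map(json.loads)
                | "KeepValid" >> beam.Filter(lambda event: "event_id" in event)
                # Hypothetical destination table and schema.
                | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                    table="example-project:analytics.events",
                    schema="event_id:STRING,event_type:STRING,event_ts:TIMESTAMP",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                )
            )


    if __name__ == "__main__":
        run()

In practice a pipeline like this would run on the Dataflow runner, with the destination table partitioned and clustered in line with the BigQuery practices listed above.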
Requirements:
2+ years of hands-on data engineering experience on GCP
Proven expertise in BigQuery, Dataflow (Apache Beam), and Cloud Composer (Airflow)
Strong programming skills in Python and/or Java
Experience with SQL optimization, data modeling, and pipeline orchestration
Familiarity with Git, CI/CD pipelines, and data quality monitoring frameworks
Exposure to Dataform, dbt, or similar tools for ELT workflows
Solid understanding of data architecture, schema design, and performance tuning
Excellent problem-solving and collaboration skills
Bonus Skills:
GCP Professional Data Engineer certification
Experience with Vertex AI, Cloud Functions, Dataproc, or real-time streaming architectures
Familiarity with data governance tools (e.g., Atlan, Collibra, Dataplex)
Exposure to Docker/Kubernetes, API integration, and infrastructure-as-code (Terraform)