Gurugram, Haryana, India
Information Technology
Full-Time
UST
Overview
Role Description
Skills: GCP, PySpark, Airflow
Key Responsibilities:
- Design, develop, and automate scalable data processing workflows using Apache Airflow, PySpark, and Dataproc on Google Cloud Platform (GCP); a minimal DAG sketch follows this list.
- Build and maintain robust ETL pipelines to handle structured and unstructured data from multiple sources and formats.
- Manage and provision GCP resources including Dataproc clusters, serverless batches, Vertex AI instances, GCS buckets, and custom images.
- Provide platform and pipeline support for analytics and product teams, resolving issues related to Spark, BigQuery, Airflow DAGs, and serverless workflows.
- Collaborate with data scientists, data analysts, and other stakeholders to understand data requirements and deliver reliable solutions.
- Deliver prompt and effective technical support to internal users for data-related queries and challenges.
- Optimize and fine-tune data systems for performance, cost-efficiency, and reliability.
- Conduct root cause analysis for recurring pipeline/platform issues and work with cross-functional teams to implement long-term solutions.
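For illustration, a minimal sketch of the kind of workflow this role involves: an Airflow DAG that submits a PySpark job to an existing Dataproc cluster. The project, region, cluster, and GCS paths below are hypothetical placeholders, and the sketch assumes Airflow 2.4+ with the apache-airflow-providers-google package installed.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

# Hypothetical identifiers for illustration only.
PROJECT_ID = "example-project"
REGION = "us-central1"
CLUSTER_NAME = "example-dataproc-cluster"

# Dataproc job spec pointing at a PySpark script stored in GCS.
PYSPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": CLUSTER_NAME},
    "pyspark_job": {"main_python_file_uri": "gs://example-bucket/jobs/etl_job.py"},
}

with DAG(
    dag_id="daily_pyspark_etl",
    schedule="@daily",  # run once per day
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    run_etl = DataprocSubmitJobOperator(
        task_id="submit_pyspark_etl",
        project_id=PROJECT_ID,
        region=REGION,
        job=PYSPARK_JOB,
    )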
Requirements:
- Strong programming expertise in Python and SQL
- Deep hands-on experience with Apache Airflow (including Astronomer)
- Strong experience with PySpark, SparkSQL, and Dataproc (a minimal PySpark job sketch follows this list)
- Proven knowledge of and implementation experience with GCP data services: BigQuery, Vertex AI, Pub/Sub, Cloud Functions, and GCS
- Strong troubleshooting skills related to data pipelines, Spark job failures, and cloud data environments
- Familiarity with data modeling, ETL best practices, and distributed systems
- Ability to support and optimize large-scale batch and streaming data processes
- Experience with SQL dialects such as HiveQL, PL/SQL, and SparkSQL
- Exposure to serverless data processing and ML model deployment workflows (using Vertex AI)
- Familiarity with Terraform or Infrastructure-as-Code (IaC) for provisioning GCP resources
- Knowledge of data governance, monitoring, and cost control best practices on GCP
- Previous experience in healthcare, retail, or BFSI domains involving large-scale data platforms
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- Certifications in GCP Data Engineer, GCP Professional Cloud Architect, or Apache Spark are a plus
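As a small illustration of the PySpark work described above, here is a sketch of a batch job that reads raw CSV data from GCS and writes a daily aggregate to BigQuery. The bucket, path, and table names are hypothetical, and the sketch assumes the spark-bigquery connector is available on the Dataproc cluster.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical paths and tables for illustration only.
SOURCE_PATH = "gs://example-bucket/raw/events/*.csv"
TARGET_TABLE = "example_dataset.events_daily"
TEMP_BUCKET = "example-temp-bucket"

spark = SparkSession.builder.appName("events-daily-etl").getOrCreate()

# Read raw CSV files from GCS (Dataproc ships with the GCS connector).
raw = spark.read.csv(SOURCE_PATH, header=True, inferSchema=True)

# Example transformation: de-duplicate, then aggregate events per day.
daily = (
    raw.dropDuplicates(["event_id"])
       .groupBy(F.to_date("event_ts").alias("event_date"))
       .agg(F.count("*").alias("event_count"))
)

# Write to BigQuery via the spark-bigquery connector (assumed installed);
# the indirect write method stages data through a temporary GCS bucket.
(daily.write.format("bigquery")
      .option("table", TARGET_TABLE)
      .option("temporaryGcsBucket", TEMP_BUCKET)
      .mode("overwrite")
      .save())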