300 – 500 Indian Rupees – Hourly
Noida, Uttar Pradesh, India
Information Technology
Full-Time
Sagebridge Pvt Ltd
Key Responsibilities
- Pipeline Development: Design, implement, and maintain scalable ETL/ELT pipelines using PySpark on Databricks (a brief illustrative sketch follows this list).
- Data Optimization: Tune Spark jobs for performance (e.g., partitioning, caching) and reduce costs in Databricks environments.
- Lakehouse Architecture: Build and manage Delta Lake tables for ACID compliance, schema evolution, and time-travel queries.
- Data Governance: Implement data quality checks (e.g., Great Expectations), metadata management, and lineage tracking.
- Collaboration: Partner with analytics teams to provision clean datasets for BI tools (Tableau/Power BI) and ML models.
- Cloud Integration: Deploy pipelines on AWS/Azure/GCP via Databricks workflows, integrating with services like S3, Redshift, or Snowflake.
- Automation: Develop CI/CD pipelines for Databricks notebooks/jobs (e.g., using Git, Jenkins, or Databricks CLI).
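To give a flavour of the day-to-day work described above, here is a minimal, hypothetical sketch of a PySpark batch job on Databricks that lands raw data in a Delta Lake table with schema evolution enabled. It is an illustration only, not the employer's actual pipeline; the paths, column names, and table name are placeholders.

```python
# Illustrative ETL sketch: raw JSON -> cleaned DataFrame -> Delta table.
# All paths, columns, and the target table name are hypothetical.
from pyspark.sql import SparkSession, functions as F

# On Databricks a `spark` session already exists; building one keeps this self-contained.
spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

# Extract: read raw JSON from a cloud-storage landing zone (hypothetical mount point).
raw = spark.read.json("/mnt/landing/orders/")

# Transform: deduplicate, parse timestamps, derive a partition column, drop bad rows.
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: append to a Delta table, allowing additive schema changes and
# partitioning by date for job and query performance.
(
    orders.write.format("delta")
          .mode("append")
          .option("mergeSchema", "true")
          .partitionBy("order_date")
          .saveAsTable("analytics.orders")  # hypothetical metastore / Unity Catalog table
)
```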
Required Qualifications
- Experience: 5+ years in data engineering with 3+ years focused on PySpark and Databricks.
- Technical Skills:
  - Expert-level proficiency in PySpark (DataFrames, Spark SQL, RDDs).
  - Hands-on experience with Databricks (Workflows, Delta Lake, Unity Catalog, cluster optimization).
  - Advanced SQL and data modeling (star/snowflake schemas).
  - Python scripting and libraries (Pandas, NumPy).
- Cloud Platforms: Production experience with AWS, Azure, or GCP.
- Data Warehousing: Knowledge of modern stacks (e.g., Delta Lake + Snowflake/Redshift).
- Version Control: Proficient with Git/GitHub.
Preferred Qualifications
- Certifications: Databricks Certified Data Engineer Associate/Professional, AWS/Azure Data Engineer.
- Streaming: Experience with Spark Structured Streaming or Kafka.
- ML Integration: Familiarity with MLOps for deploying ML models in Databricks (MLflow).
- Infrastructure as Code (IaC): Terraform/CloudFormation for managing Databricks resources.
- Agile: Experience in Scrum/Kanban environments.
Job Types: Full-time, Contractual / Temporary
Contract length: 3 months
Pay: ₹300.00 - ₹500.00 per hour
Expected hours: 20 – 40 per week
Benefits:
- Work from home
Schedule:
- Monday to Friday
Experience:
- PySpark: 5 years (Required)
- Delta Lake: 2 years (Preferred)
- Data streaming: 1 year (Preferred)
- Data modeling: 1 year (Required)
- Data pipelines: 1 year (Required)
Work Location: Remote
Talk to us
Feel free to call, email, or reach us on our social media channels.
Email: info@antaltechjobs.in