Hyderabad, Telangana, India
Information Technology
Full-Time
Sedin Technologies
Overview
Databricks Data Engineer
Key Responsibilities:
- Design, develop, and maintain high-performance data pipelines using Databricks and Apache Spark.
- Implement medallion architecture (Bronze, Silver, Gold layers) for efficient data processing (see the sketch after this list).
- Optimize Delta Lake tables through partitioning, Z-ordering, and performance tuning in Databricks.
- Develop ETL/ELT processes using PySpark, SQL, and Databricks Workflows.
- Manage Databricks clusters, jobs, and notebooks for batch and real-time data processing.
- Work with Azure Data Lake, AWS S3, or GCP Cloud Storage for data ingestion and storage.
- Implement CI/CD pipelines for Databricks jobs and notebooks using DevOps tools.
- Monitor and troubleshoot performance bottlenecks, optimize clusters, and manage costs.
- Ensure data quality, governance, and security using Unity Catalog, ACLs, and encryption.
- Collaborate with Data Scientists, Analysts, and Business Teams to deliver data-driven solutions.
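For illustration only, the following is a minimal PySpark sketch of a Bronze-to-Silver Delta Lake step with an OPTIMIZE/Z-ORDER pass; the paths, table layout, and column names (event_id, event_ts, customer_id) are placeholders, not part of any actual codebase for this role.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already available as `spark` in Databricks notebooks

# Bronze: raw records landed from cloud storage (path is a placeholder).
bronze = spark.read.format("delta").load("/mnt/lake/bronze/events")

# Silver: deduplicated, typed, and validated records.
silver = (
    bronze.dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_id").isNotNull())
)

(silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/lake/silver/events"))

# Compact small files and co-locate rows on a common filter column.
spark.sql("OPTIMIZE delta.`/mnt/lake/silver/events` ZORDER BY (customer_id)")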

Skills & Experience:
- 5+ years of hands-on experience in Databricks, Apache Spark, and Delta Lake.
- Strong SQL, PySpark, and Python programming skills.
- Experience in Azure Data Factory (ADF), AWS Glue, or GCP Dataflow.
- Expertise in performance tuning, indexing, caching, and parallel processing.
- Hands-on experience with Lakehouse architecture and Databricks SQL.
- Strong understanding of data governance, lineage, and cataloging (e.g., Unity Catalog).
- Experience with CI/CD pipelines (Azure DevOps, GitHub Actions, or Jenkins).
- Familiarity with Airflow, Databricks Workflows, or similar orchestration tools.
- Strong problem-solving skills with experience in troubleshooting Spark jobs.
- Hands-on experience with Kafka, Event Hubs, or real-time streaming in Databricks (see the streaming sketch after this list).
- Certifications in Databricks, Azure, AWS, or GCP.
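
As an example of the real-time side of the role, here is a minimal Structured Streaming sketch that ingests from Kafka into a Bronze Delta table; the broker address, topic name, and checkpoint/output paths are assumptions used purely for illustration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

raw = (spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder broker
    .option("subscribe", "events")                       # placeholder topic
    .option("startingOffsets", "latest")
    .load())

# Keep the raw payload as-is for the Bronze layer; parsing happens downstream.
bronze = raw.select(
    F.col("key").cast("string"),
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp"),
)

(bronze.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/bronze_events")  # placeholder
    .outputMode("append")
    .trigger(availableNow=True)  # incremental batch run; use processingTime for continuous
    .start("/mnt/lake/bronze/events"))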