Information Technology
Full-Time
Skan AI
Overview
Job Summary
We are seeking a skilled Data Engineer with 3 to 5 years of experience building scalable data pipelines and solutions, and with strong hands-on expertise in Databricks. The ideal candidate is proficient with large-scale data processing frameworks and has a solid understanding of Delta Lake, PySpark, and cloud-based data platforms.
Key Responsibilities:
- Design, build, and maintain robust ETL/ELT pipelines using Databricks (PySpark/SQL); a short sketch of this kind of pipeline step follows this list.
- Develop and optimize data workflows and pipelines on Delta Lake and Databricks Lakehouse architecture.
- Integrate data from multiple sources, ensuring data quality, reliability, and performance.
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into scalable data solutions.
- Monitor and troubleshoot production data pipelines; ensure performance and cost optimization.
- Work with DevOps teams for CI/CD integration and automation of Databricks jobs and notebooks.
- Maintain metadata, documentation, and versioning for data pipelines and assets.
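To make the pipeline work above concrete, here is a minimal, illustrative PySpark sketch of one such ETL step on Databricks: reading raw JSON events, applying basic cleansing, and appending the result to a Delta table. The source path, column names, and target table are hypothetical, chosen only for illustration.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession named `spark` is provided; this builder
# form is only needed when running the sketch outside a notebook.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Hypothetical source path and target table, for illustration only.
RAW_PATH = "abfss://raw@examplelake.dfs.core.windows.net/orders/"
TARGET_TABLE = "analytics.orders_clean"

# Extract: read raw JSON order events.
raw = spark.read.json(RAW_PATH)

# Transform: deduplicate, normalize types, drop invalid rows.
clean = (
    raw
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("amount") > 0)
)

# Load: append to a Delta table, partitioned by order date.
(
    clean
    .withColumn("order_date", F.to_date("order_ts"))
    .write.format("delta")
    .mode("append")
    .partitionBy("order_date")
    .saveAsTable(TARGET_TABLE)
)
```

In practice a step like this would typically run as a scheduled Databricks Job or Workflow task rather than an interactive notebook.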
Required Skills:
- 3–5 years of experience in data engineering or big data development.
- Strong hands-on experience with Databricks (Notebooks, Jobs, Workflows).
- Proficiency in PySpark, Spark SQL, and Delta Lake; see the upsert sketch after this list.
- Experience working with Azure or AWS (preferably Azure Data Lake Storage, Blob Storage, Synapse, etc.).
- Strong SQL skills for data manipulation and analysis.
- Familiarity with Git, CI/CD pipelines, and job orchestration tools (e.g., Airflow, Databricks Workflows).
- Understanding of data modeling, data warehousing, and data governance best practices.
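As a concrete instance of the Delta Lake proficiency listed above, the sketch below shows a typical idempotent upsert (MERGE) using the Delta Lake Python API; the staging and target table names are hypothetical.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Hypothetical tables: `updates` holds the latest batch of order events.
updates = spark.table("staging.orders_updates")
target = DeltaTable.forName(spark, "analytics.orders_clean")

# Idempotent upsert: update existing orders, insert new ones.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

MERGE keeps the pipeline re-runnable: replaying the same batch updates existing rows instead of duplicating them.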