Overview
Key Responsibilities
Design, develop, and maintain end-to-end data pipelines using Azure Data Factory (ADF), Databricks, Synapse Analytics, and related Azure services.
Implement ETL/ELT processes to integrate data from multiple structured and unstructured data sources.
Build and optimize data lake and data warehouse solutions in Azure Data Lake Storage (ADLS) and Azure Synapse.
Collaborate with data scientists, analysts, and business stakeholders to deliver scalable data solutions.
Develop CI/CD pipelines for data workflows using Azure DevOps or equivalent tools.
Apply data governance, data quality, and security best practices in line with enterprise standards.
Monitor, troubleshoot, and optimize data solutions for performance and cost efficiency.
Mentor junior data engineers and provide technical leadership in data solution design.
Technical Skills
Azure Services: Azure Data Factory (ADF), Azure Databricks, Azure Synapse Analytics, Azure Data Lake Storage (ADLS Gen2), Azure SQL Database.
Programming & Scripting: SQL, Python, PySpark, or Scala.
Data Modeling: Dimensional modeling, star schema, snowflake schema.
Version Control & CI/CD: Git, Azure DevOps, or GitHub Actions.
Workflow Orchestration: ADF pipelines, Databricks jobs.
Big Data & Distributed Processing: Spark, Delta Lake.
Data Integration Tools: ETL/ELT design, API integration.
Performance Optimization: Query tuning, partitioning, caching strategies.