
Overview
Analytics
Posted on Mar 10, 2025
Minimum Required Experience: 8 years
Full Time
Description
Project Role Summary
We are looking for an experienced Senior Data Engineer (8+ years) to design, develop, and optimize data pipelines and storage layers in a Medallion Architecture on Microsoft Azure. The ideal candidate will build scalable ETL/ELT pipelines and ensure data governance, security, and performance optimization for batch and real-time data processing.
This role requires expertise in Azure Data Factory (ADF), Azure Databricks, Delta Lake, Microsoft Purview, Unity Catalog, ADLS Gen2 and Python/PySpark to transform raw API-based data into curated and structured data marts.
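As a rough, illustrative sketch of the kind of work this role involves (not part of the requirements), the PySpark snippet below shows a minimal Bronze-to-Silver step in a Medallion Architecture: raw API payloads are cleaned, deduplicated, and written to Delta Lake. The storage account, container paths, and column names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

# On Azure Databricks a SparkSession is provided; this builder keeps the sketch self-contained.
spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Raw API payloads landed as JSON in the Bronze layer (hypothetical path).
raw = spark.read.format("json").load(
    "abfss://bronze@yourstorageaccount.dfs.core.windows.net/api/events/")

# Basic cleansing: typed timestamp, one row per business key, null-key quality gate.
cleaned = (raw
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .dropDuplicates(["event_id"])
           .filter(F.col("event_id").isNotNull()))

# Curated Silver table in Delta format.
(cleaned.write.format("delta")
 .mode("append")
 .save("abfss://silver@yourstorageaccount.dfs.core.windows.net/events/"))
```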
Key Responsibilities
- Design, document and implement scalable ETL/ELT pipelines to process data from APIs into Bronze, Silver and Gold layers.
- Design, document, develop, and optimize batch & streaming data pipelines using Azure Data Factory, Azure Databricks and Azure Event Hubs.
- Implement incremental loading strategies (CDC, Delta Lake merge/upsert) to efficiently manage historical and real-time data (see the sketch after this list).
- Optimize Azure Synapse Analytics queries for analytical performance.
- Design efficient storage solutions leveraging Azure Data Lake Storage Gen2 & Delta Lake.
- Build and maintain dimensional models (Star Schema, Snowflake Schema) in Gold Layer for analytical and reporting use cases.
- Develop fact and dimension tables, ensuring referential integrity, indexing and partitioning.
- Implement data validation, schema enforcement, and quality checks across Bronze, Silver and Gold layers.
- Ensure compliance with data governance frameworks using Microsoft Purview.
- Implement Role-Based Access Control (RBAC), encryption and data masking for secure data handling.
- Optimize ETL pipelines, queries, and Databricks clusters for cost efficiency.
- Implement Azure Monitor & Log Analytics for real-time data pipeline monitoring.
- Fine-tune partitioning, caching and indexing strategies for high-performance analytics.
- Work closely with Data Architects, Analysts, BI Developers and DevOps teams to ensure smooth data integration.
- Establish CI/CD pipelines for data engineering (Azure DevOps, GitHub Actions).
- Document data pipelines, models, and transformations in a structured data dictionary.
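The incremental loading item above references Delta Lake merge/upsert; the sketch below shows one minimal way to upsert a batch of changed records into a Silver table with the delta-spark `DeltaTable` API. The paths and the `order_id` business key are hypothetical, and the merge condition would depend on the actual CDC feed.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("silver-upsert").getOrCreate()

# Incremental batch of changed records from the Bronze layer (hypothetical path).
updates = spark.read.format("delta").load(
    "abfss://bronze@yourstorageaccount.dfs.core.windows.net/orders_changes/")

# Target Silver table.
silver = DeltaTable.forPath(
    spark, "abfss://silver@yourstorageaccount.dfs.core.windows.net/orders/")

# Upsert: update rows that match on the business key, insert the rest.
(silver.alias("t")
 .merge(updates.alias("s"), "t.order_id = s.order_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```

A real pipeline would typically also track a watermark, or use Delta's Change Data Feed, so each run picks up only new changes.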
Technical Skills
- Strong experience with Azure Data Factory (ADF) and orchestration of ETL pipelines
- Strong experience with Azure Databricks, PySpark, and Python
- Strong experience with Delta Lake, including optimized storage, versioning, and ACID transactions
- Strong experience with SQL-based analytical processing
- Strong experience with writing and optimizing ETL/ELT workflows
- Strong experience with streaming frameworks (Azure Stream Analytics, Event Hubs)
- Strong experience with dimensional modeling (Star Schema, Snowflake Schema)
- Strong experience with data partitioning, indexing, and query performance tuning (see the sketch after this list)
- Strong experience with Microsoft Purview, Unity Catalog for data lineage & metadata management
- Strong experience with RBAC, data masking, and encryption to protect sensitive information
- Strong experience with cost-efficient Databricks cluster management
- Strong experience with CI/CD for data pipeline deployment using Azure DevOps
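For the partitioning and query-tuning item above, here is a hedged sketch of two common Delta Lake layout techniques: partitioning on a low-cardinality filter column and Z-ordering on a high-cardinality key. Table paths and column names are hypothetical, and `OPTIMIZE ... ZORDER BY` is Databricks-specific.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("layout-tuning").getOrCreate()

df = spark.read.format("delta").load("/mnt/silver/events")

# Partition by a low-cardinality column that queries commonly filter on.
(df.write.format("delta")
 .mode("overwrite")
 .partitionBy("event_date")
 .save("/mnt/gold/events"))

# Co-locate rows with similar key values inside files, so selective
# queries on that key can skip more files (data skipping).
spark.sql("OPTIMIZE delta.`/mnt/gold/events` ZORDER BY (customer_id)")
```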
Soft Skills
- Strong analytical and problem-solving mindset
- Ability to collaborate with cross-functional teams
- Excellent documentation skills
- Good communication skills
Nice to Have
- Microsoft Azure Data Engineer Associate (DP-203) certification
- Databricks Certified Data Engineer
- Infrastructure as Code (IaC) using Terraform
Education
Bachelor’s/Master’s degree in Computer Science, Information Technology, or a related field
Experience
8+ years of total experience, with at least 5 years of hands-on experience in Data Engineering, Big Data Processing, and Azure cloud-based data engineering solutions, such as implementing data lakes with a Medallion Architecture.