Hyderabad, Telangana, India
Information Technology
Full-Time
UST
Overview
Role Description
PySpark, SQL, Azure Databricks, AWS
Key Accountabilities / Responsibilities:
- Provide technical direction and leadership to data engineers on data platform initiatives, ensuring adherence to best practices in data modelling, end-to-end pipeline design, and code quality.
- Review and optimize PySpark, SQL, and Databricks code for performance, scalability, and maintainability (see the sketch after this list).
- Offer engineering support and mentorship to data engineering teams within delivery squads, guiding them in building robust, reusable, and secure data solutions.
- Collaborate with architects to define data ingestion, transformation, and storage strategies leveraging Azure services such as Azure Data Factory, Azure Databricks, and Azure Data Lake Storage.
- Drive automation and CI/CD practices in data pipelines using tools such as Git, Azure DevOps, and DBT (good to have).
- Ensure optimal data quality and lineage by implementing proper testing, validation, and monitoring mechanisms within data pipelines.
- Stay current with evolving data technologies, tools, and best practices, continuously improving standards, frameworks, and engineering methodologies.
- Troubleshoot complex data issues, analyse system performance, and provide solutions to development and service challenges.
- Coach, mentor, and support team members through knowledge-sharing sessions and technical reviews.
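For illustration, a minimal PySpark sketch of the kind of code-review feedback described above: replacing a shuffle-heavy join with a broadcast join and adding a basic validation gate. All table names, columns, and Delta paths (orders, customers, /mnt/lake/...) are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("code-review-sketch").getOrCreate()

# Large fact table and small dimension table (hypothetical Delta paths).
orders = spark.read.format("delta").load("/mnt/lake/orders")
customers = spark.read.format("delta").load("/mnt/lake/customers")

# Broadcasting the small dimension avoids shuffling the large fact table.
enriched = orders.join(F.broadcast(customers), on="customer_id", how="left")

# Simple validation gate: a left join should preserve the fact-table row count
# unless the dimension contains duplicate keys.
if enriched.count() != orders.count():
    raise ValueError("Row count changed after join; check customer_id uniqueness")

enriched.write.format("delta").mode("overwrite").save("/mnt/lake/orders_enriched")
```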
Skills / Experience:
- Developer/engineering background in large-scale distributed data processing systems (or equivalent experience), with the ability to provide constructive feedback grounded in that knowledge.
- Proficient in designing scalable and efficient data models tailored for analytical and operational workloads, ensuring data integrity and optimal query performance.
- Practical experience implementing and managing Unity Catalog for centralized governance of data assets across Databricks workspaces, including access control, lineage tracking, and auditing.
- Demonstrated ability to optimize data pipelines and queries using techniques such as partitioning, caching, indexing, and adaptive execution strategies to improve performance and reduce costs (see the sketch after this list).
- Programming in PySpark (must), SQL (must), and Python (good to have).
- Experience with Databricks (mandatory) and DBT (good to have).
- Experience implementing cloud data technologies on Azure (must); GCP or AWS experience is optional.
- Knowledge of shortening development lead time and improving the data development lifecycle.
- Experience working in an Agile delivery framework.
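For illustration, a minimal PySpark sketch of the partitioning and caching techniques listed above. The event dataset, column names, and Delta paths are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pipeline-optimization-sketch").getOrCreate()

events = spark.read.format("delta").load("/mnt/lake/raw_events")  # hypothetical source

# Derive the partition column once and cache the result, since it feeds
# two separate aggregations below.
daily = events.withColumn("event_date", F.to_date("event_ts")).cache()

by_user = daily.groupBy("event_date", "user_id").count()
by_type = daily.groupBy("event_date", "event_type").count()

# Writing partitioned by event_date lets downstream queries prune partitions
# instead of scanning the full table.
(by_user.write.format("delta").mode("overwrite")
    .partitionBy("event_date").save("/mnt/lake/agg_by_user"))
(by_type.write.format("delta").mode("overwrite")
    .partitionBy("event_date").save("/mnt/lake/agg_by_type"))

daily.unpersist()
```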