Bangalore, Karnataka, India
Information Technology
Full-Time
UST
Overview
Role Description
Roles & Responsibilities:
Development & Implementation
- Design, build, and maintain large-scale batch and real-time data pipelines using PySpark, Spark, Hive, and related big data tools.
- Write clean, efficient, and scalable code aligned with application design and coding standards.
- Create and maintain technical documentation including design documents, test cases, and configurations.
- Contribute to HLD, LLD, and data architecture documents.
- Review and validate designs and code from peers and junior developers.
- Lead technical discussions and decisions with cross-functional teams.
- Optimize data processing workflows for efficiency, cost, and performance.
- Manage data quality and ensure data accuracy, lineage, and governance across the pipeline.
- Collaborate with product managers, data stewards, and business stakeholders to translate data requirements into robust engineering solutions.
- Clarify requirements and propose design options to customers.
- Write and review unit tests and integration tests to ensure data integrity and performance.
- Monitor and troubleshoot data pipeline issues and ensure minimal downtime.
- Participate in sprint planning, estimation, and daily stand-ups.
- Ensure on-time delivery of user stories and bug fixes.
- Drive release planning and execution processes.
- Set FAST goals and provide timely feedback to team members.
- Mentor junior engineers, contribute to a positive team environment, and drive continuous improvement.
- Ensure adherence to compliance standards such as SOX, HIPAA, and organizational coding standards.
- Contribute to knowledge repositories, project wikis, and best practice documents.
Skills & Qualifications
- Minimum 6 years of experience as a Data Engineer.
- Hands-on expertise in PySpark and SQL.
- Experience in Google Cloud Platform (GCP) or similar cloud environments (AWS, Azure).
- Proficient in Big Data technologies such as Spark, Hadoop, Hive.
- Solid understanding of ETL/ELT frameworks, data warehousing, and data modeling.
- Strong knowledge of CI/CD tools (Jenkins, Git, Ansible, etc.).
- Excellent problem-solving and analytical skills.
- Strong written and verbal communication skills.
- Experience with Agile/Scrum methodologies.
- Experience with data orchestration tools (Airflow, Control-M).
- Familiarity with modern data platforms such as Snowflake, DataRobot, Denodo.
- Experience in containerized environments (Kubernetes, Docker).
- Exposure to data security, governance, and compliance frameworks.
- Hands-on experience with Terraform, ARM Templates, or similar infrastructure-as-code tools for infrastructure automation.
- Domain knowledge in banking, healthcare, or retail industries.
Skills: Spark, Hadoop, Hive, GCP