Pune, Maharashtra, India
Information Technology
Full-Time
Oneture Technologies
Overview
We are looking for a highly skilled and hands-on Senior Data Engineer to join our growing data engineering practice in Mumbai. This role requires deep technical expertise in building and managing enterprise-grade data pipelines, with a primary focus on Amazon Redshift, AWS Glue, and data orchestration using Airflow or Step Functions. You will be responsible for building scalable, high-performance data workflows that ingest and process multi-terabyte-scale data across complex, concurrent environments.
The ideal candidate is someone who thrives on solving performance bottlenecks, has led or participated in data warehouse migrations (e.g., Snowflake to Redshift), and is confident interfacing with business stakeholders to translate requirements into robust data solutions.
Responsibilities
- Design, develop, and maintain high-throughput ETL/ELT pipelines using AWS Glue (PySpark), orchestrated via Apache Airflow or AWS Step Functions.
- Own and optimize large-scale Amazon Redshift clusters and manage high concurrency workloads for a very large user base.
- Lead and contribute to migration projects from Snowflake or traditional RDBMS to Redshift, ensuring minimal downtime and robust validation.
- Integrate and normalize data from heterogeneous sources, including REST APIs, AWS Aurora (MySQL/Postgres), streaming inputs, and flat files.
- Implement intelligent caching strategies and leverage EC2 and serverless compute (Lambda, Glue) for custom transformations and processing at scale.
- Write advanced SQL for analytics, data reconciliation, and validation, demonstrating strong SQL development and tuning experience.
- Implement comprehensive monitoring, alerting, and logging for all data pipelines to ensure reliability, availability, and cost optimization.
- Collaborate directly with product managers, analysts, and client-facing teams to gather requirements and deliver insights-ready datasets.
- Champion data governance, security, and lineage, ensuring data is auditable and well-documented across all environments.
Requirements
- 1-3 years of core data engineering experience, with a focus on hands-on Amazon Redshift performance tuning and large-scale cluster management.
- Demonstrated experience handling multi-terabyte Redshift clusters, concurrent query loads, and managing complex workload segmentation and queue priorities.
- Strong experience with AWS Glue (PySpark) for large-scale ETL jobs.
- Solid understanding and implementation experience of workflow orchestration using Apache Airflow or AWS Step Functions.
- Strong proficiency in Python, advanced SQL, and data modeling concepts.
- Familiarity with CI/CD pipelines, Git, DevOps processes, and infrastructure-as-code concepts.
- Experience with Amazon Athena, Lake Formation, or S3-based data lakes.
- Hands-on participation in Snowflake, BigQuery, or Teradata migration projects.
- Exposure to real-time streaming architectures or Lambda architectures.
- AWS Certified Data Analytics - Specialty.
- AWS Certified Solutions Architect - Associate/Professional.
- Excellent communication skills; able to confidently engage with both technical and non-technical stakeholders, including clients.
- Strong problem-solving mindset and a keen attention to performance, scalability, and reliability.
- Demonstrated ability to work independently, lead tasks, and take ownership of large-scale systems.
- Comfortable working in a fast-paced, dynamic, and client-facing environment.