Pune, Maharashtra, India
Information Technology
Full-Time
MyCareernet
Overview
Company: Indian / Global Digital Organization
Key Skills: PySpark, AWS, Python, Scala, ETL
Roles and Responsibilities:
- Develop and deploy ETL and data warehousing solutions using Python libraries and Linux bash scripts on AWS EC2, with data stored in Redshift (an illustrative sketch follows this list).
- Collaborate with product and analytics teams to scope business needs, design metrics, and build reports/dashboards.
- Automate and optimize existing data sets and ETL pipelines for efficiency and reliability.
- Work with multi-terabyte data sets and write complex SQL queries to support analytics.
- Design and implement ETL solutions integrating multiple data sources using Pentaho.
- Utilize Linux/Unix scripting for data processing tasks.
- Leverage AWS services (Redshift, S3, EC2) for storage, processing, and pipeline automation.
- Follow software engineering best practices for coding standards, code reviews, source control, testing, and operations.
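To give a rough sense of the kind of pipeline described above, here is a minimal PySpark sketch that reads raw events from S3, computes a daily aggregate, and stages the result back to S3 for loading into Redshift. All bucket names, paths, and column names are hypothetical placeholders, not details from this posting.

```python
# Illustrative PySpark ETL sketch -- bucket names, paths, and columns are
# hypothetical examples, not details taken from this job posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_orders_etl").getOrCreate()

# Read raw JSON events previously landed in S3.
orders = spark.read.json("s3a://example-raw-bucket/orders/dt=2024-01-01/")

# Filter to completed orders and roll up a daily revenue metric.
daily = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .groupBy(F.to_date("created_at").alias("order_date"))
    .agg(F.count("*").alias("order_count"),
         F.sum("amount").alias("revenue"))
)

# Stage the aggregate back to S3 as Parquet; Redshift can then ingest it
# with a COPY command (or query it externally via Redshift Spectrum).
daily.write.mode("overwrite").parquet("s3a://example-curated-bucket/daily_orders/")
```

In practice a job like this would typically be packaged and launched with spark-submit on EC2/EMR and triggered by a scheduler, in line with the automation and pipeline responsibilities listed above.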
Skills Required:
Must-Have:
- Hands-on experience with PySpark for big data processing
- Strong knowledge of AWS services (Redshift, S3, EC2)
- Proficiency in Python for data processing and automation
- Strong SQL skills for working with RDBMS and multi-terabyte data sets
Nice-to-Have:
- Experience with Scala for distributed data processing
- Knowledge of ETL tools such as Pentaho
- Familiarity with Linux/Unix scripting for data operations
- Exposure to data modeling, pipelines, and visualization
Education: Bachelor's degree in Computer Science, Information Technology, or a related field