Mumbai, Maharashtra, India
Information Technology
Full-Time
Clupen Business Solutions LLP
Overview
Description
Job Title : Data Engineer
Location : Candidate should be based in Kerala and willing to work remotely
Shift Timing : 2:00 PM - 11:00 PM (UK Shift)
Experience : 3.5+ Years
Notice Period : Immediate joiners, or a maximum of 20 days' notice
Job Summary
The client is looking for a skilled Data Engineer with strong experience in Python, PySpark, and cloud-based data engineering tools to support the development of scalable ETL pipelines and data processing systems. The ideal candidate will have hands-on exposure to pipeline orchestration, cloud storage, and distributed data environments, and will work collaboratively in a remote setup.
Key Responsibilities
- Build and maintain ETL pipelines for processing structured and semi-structured data.
- Assist in data pipeline orchestration using Airflow and support DAG development.
- Work with AWS S3 and AWS Glue to load, transform, and manage datasets in cloud environments.
- Support container-based packaging using Docker and contribute to deployments in Kubernetes-managed environments.
- Collaborate with cross-functional teams using Git, Teams, and project management tools.
- Maintain documentation, pipeline logs, and workflow tracking using Jira and Confluence.
- Participate in quality assurance, compliance activities, and continuous learning initiatives as required.
Required Skills and Qualifications
- Bachelor's degree in Computer Science, Data Engineering, Data Science, Statistics, or a related field.
- Minimum 3.5 years of experience in data engineering or related roles.
- Python & PySpark : Minimum 2 years of experience in building data processing pipelines.
- Apache Airflow : Minimum 1 year of experience supporting DAG development and orchestration workflows.
- AWS S3 & AWS Glue : At least 6 months of experience working with cloud data storage and transformations.
- Git & Jupyter : Minimum 1 year of experience using Git for version control and notebooks for prototyping.
- Docker / Kubernetes : At least 6 months of exposure to containerization and orchestration environments.
Tools and Work Environment
- macOS-based work environment
- JupyterLab / Python IDEs
- GitHub for collaboration
- Microsoft Teams & Outlook for communication
- EMR Studio for distributed data processing exposure
- Jira & Confluence for documentation and workflow tracking
(ref:hirist.tech)