Guwahati, Assam, India
Information Technology
Full-Time
Sigmoid
Overview
We are looking for a skilled Data Engineer with 6+ years of experience in big data technologies, particularly Python, PySpark, SQL, and data lakehouse architectures. The ideal candidate will have a strong background in building scalable data pipelines and experience with modern data storage formats, including Apache Iceberg. You will work closely with cross-functional teams to design and implement efficient data solutions in a cloud-based environment.
Data Pipeline Development
The core responsibilities for the job include the following:
- Design, build, and optimize scalable data pipelines using Apache Spark.
- Implement and manage large-scale data processing solutions across data lakehouses.
- Work with modern data lakehouse platforms (e.g., Apache Iceberg) to handle large datasets.
- Optimize data storage, partitioning, and versioning to ensure efficient access and querying.
- Write complex SQL queries to extract, manipulate, and transform data.
- Develop performance-optimized queries for analytical and reporting purposes.
- Integrate various structured and unstructured data sources into the lakehouse environment.
- Work with stakeholders to define data needs and ensure data is available for downstream consumption.
- Implement data quality checks and ensure the reliability and accuracy of data.
- Contribute to metadata management and data cataloging efforts.
- Monitor and optimize the performance of Spark jobs, SQL queries, and overall data infrastructure.
- Work with cloud infrastructure teams to optimize costs and scale as needed.
Requirements
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 8+ years of experience in data engineering, with a focus on Java/Python, Spark, and SQL.
- Hands-on experience with Apache Iceberg, Snowflake, or similar technologies.
- Strong understanding of data lakehouse architectures and data warehousing principles.
- Proficiency in AWS data services.
- Experience with version control systems like Git and CI/CD pipelines.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
- Experience with containerization (Docker, Kubernetes) and orchestration tools like Airflow.
- Certifications in AWS cloud technologies.
This job was posted by Ankita Swain from Sigmoid.