Overview
Job Summary
The Data Engineer plays a critical role in designing, building, and maintaining high-quality data pipelines on the Databricks platform. This position centers on the ingestion, transformation, and processing of large-scale datasets to derive meaningful business insights. Success in the role requires writing efficient code, primarily in Python and SQL, while leveraging tools such as PySpark and Delta Lake. Integration with cloud services, especially Azure, is a key component. The Data Engineer is expected to uphold the highest standards of data quality, security, and performance, and to work collaboratively with both technical and non-technical stakeholders, translating business requirements into actionable, data-driven solutions.
Qualification & Skills
Experience: 1-5 years
- Proficiency in data engineering principles, including the development and maintenance of data pipelines.
- Advanced coding skills in Python, SQL, and Scala, with significant experience working with Apache Spark.
- Hands-on experience with the Databricks platform, particularly with Delta Lake, Databricks Runtime, and Databricks Workflows.
- Familiarity with the Azure Cloud platform.
- Knowledge of the Medallion architecture (Bronze, Silver, and Gold layers).
- Experience in data ingestion, transformation, and loading processes (ETL/ELT).
- Excellent communication skills, with the ability to explain complex data concepts to both technical and non-technical audiences.
- Strong problem-solving and analytical abilities.
- Experience with the MuleSoft API platform is considered an asset.
- Background in creating ingestion pipelines from a variety of systems, such as HRIS, ERP, CRM, Microsoft SQL Server, and Apache Kafka.
- Experience with machine learning and data analytics.
- Knowledge of data governance and security best practices.
- Databricks certifications are an asset.
- Knowledge of or experience in developing and integrating custom machine learning models using Azure Machine Learning, MLflow, and other relevant libraries.
Pipeline Development:
- Design, build, and maintain scalable data pipelines for both batch and streaming data, sourced from a variety of systems. The primary technologies used in these processes are PySpark and Databricks SQL.
Data Management:
- Maintain high standards of data quality, integrity, and security throughout every stage of the data lifecycle.
Cost Management:
- Track, monitor, and report on platform compute costs, and escalate any unexpected anomalies.
Platform Optimization:
- Tune and optimize Databricks jobs and Spark configurations to enhance both performance and cost efficiency.
Cloud Integration:
- Integrate Databricks with other cloud services for storage, compute, and security, with a particular focus on Azure Data Lake Storage.
Collaboration:
- Work in close partnership with cross-functional teams—including data scientists, analysts, and business stakeholders—to understand requirements and deliver data-driven solutions tailored to their needs.
Monitoring and Support:
- Monitor the performance of data pipelines, troubleshoot issues as they arise, and provide support for user requests within the Databricks environment.
Best Practices:
- Implement and enforce best practices for data governance, security, and compliance in all aspects of data engineering activities.
About ATS
ATS Corporation is an industry-leading automation solutions and technology provider to many of the world's most successful companies. Using extensive knowledge and global capabilities in custom and repeat automation, automation products and value-added solutions including pre-automation and after-sales services, ATS businesses address the sophisticated manufacturing automation and service needs of multinational customers in markets such as life sciences, transportation, food & beverage, consumer products, and energy. With a dynamic culture that is bolstered by driven employees and the ATS Business Model (ABM), ATS companies are united by a shared purpose of creating solutions that positively impact lives around the world. Founded in 1978, ATS employs over 7,000 people at more than 65 manufacturing facilities and over 85 offices in North America, Europe, Southeast Asia and Oceania. The Company's common shares are traded on the Toronto Stock Exchange and the NYSE under the symbol ‘ATS’.