Overview
Flexing It is a freelance consulting marketplace that connects freelancers and independent consultants with organisations seeking independent talent.
Flexing It has partnered with a client, a global leader in energy management and automation, that is seeking a Data Engineer to prepare data and make it available in an efficient, optimized format for their different data consumers, ranging from BI and analytics to data science applications. The role involves working with current technologies, in particular Apache Spark, Lambda & Step Functions, Glue Data Catalog, and Redshift in an AWS environment.
Key Responsibilities:
- Design and develop new data ingestion patterns into the IntelDS Raw and/or Unified data layers, based on the requirements for connecting new data sources or building new data objects. Working with ingestion patterns allows the data pipelines to be automated.
- Participate in and apply DevSecOps practices by automating the integration and delivery of data pipelines in a cloud environment. This can include the design and implementation of end-to-end data integration tests and/or CI/CD pipelines.
- Analyze existing data models, then identify and implement performance optimizations for data ingestion and data consumption, with the objective of accelerating data availability within the platform and to consumer applications.
- Support client applications in connecting and consuming data from the platform, and ensure they follow our guidelines and best practices.
- Participate in monitoring the platform and debugging detected issues and bugs.
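As a rough illustration of the kind of ingestion pattern described above, the sketch below normalizes hypothetical raw source records into a unified schema, with invalid records filtered out for separate handling. All names (fields, layers, values) are illustrative assumptions, not details from the role; a production pipeline would run this logic in Spark or a Lambda/Step Functions workflow rather than plain Python.

```python
from datetime import datetime, timezone

# Hypothetical raw records as they might arrive from a source system
# into the Raw layer (field names are invented for illustration).
raw_records = [
    {"meter_id": "M-001", "reading": "42.5", "ts": "2024-01-15T10:00:00"},
    {"meter_id": "M-002", "reading": "n/a", "ts": "2024-01-15T10:05:00"},
]

def to_unified(record):
    """Normalize one raw record into the unified layer's schema.

    Returns None for records that fail validation; a real pipeline
    would route those to a dead-letter store for inspection.
    """
    try:
        return {
            "meter_id": record["meter_id"],
            "reading_kwh": float(record["reading"]),
            "read_at": datetime.fromisoformat(record["ts"]).replace(
                tzinfo=timezone.utc
            ),
        }
    except (KeyError, ValueError):
        return None

# Keep only records that validated successfully.
unified = [u for u in (to_unified(r) for r in raw_records) if u is not None]
```

Here the second record is dropped because its reading is not numeric; codifying such rules once per pattern, rather than per source, is what makes the pipelines automatable.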
Skills Required:
- Minimum of 3 years' prior experience as a data engineer, with proven experience with Big Data and Data Lakes in a cloud environment.
- Bachelor's or Master's degree in computer science or applied mathematics (or equivalent).
- Proven experience working with data pipelines / ETL / BI, regardless of the technology.
- Proven experience working with AWS, including at least 3 of: Redshift, S3, EMR, CloudFormation, DynamoDB, RDS, Lambda.
- Big Data technologies and distributed systems: one of Spark, Presto, or Hive.
- Python language: scripting and object-oriented programming.
- Fluency in SQL for data warehousing (Redshift in particular is a plus).
- Good understanding of data warehousing and data modelling concepts.
- Familiarity with Git, Linux, and CI/CD pipelines is a plus.
- Strong systems/process orientation with demonstrated analytical thinking, organizational skills, and problem-solving skills.
- Ability to self-manage, prioritize and execute tasks in a demanding environment.
- Strong consultancy orientation and experience, with the ability to form collaborative, productive working relationships across diverse teams and cultures, is a must.
- Willingness and ability to train and teach others.
- Ability to facilitate meetings and follow up on resulting action items.