Information Technology
Full-Time
PepsiCo
Overview
As a data engineering lead, you will be the key technical expert overseeing PepsiCo's data product build & operations and drive a strong vision for how data engineering can proactively create a positive impact on the business. You'll be empowered to create & lead a strong team of data engineers who build data pipelines into various source systems, rest data on the PepsiCo Data Lake, and enable exploration and access for analytics, visualization, machine learning, and product development efforts across the company.
Responsibilities
- Act as a subject matter expert across different digital projects.
- Oversee work with internal clients and external partners to structure and store data into unified taxonomies and link them together with standard identifiers.
- Manage and scale data pipelines from internal and external data sources to support new product launches and drive data quality across data products.
- Build and own the automation and monitoring frameworks that capture metrics and operational KPIs for data pipeline quality and performance.
- Implement best practices around systems integration, security, performance, and data management.
- Empower the business by creating value through increased adoption of data, data science, and business intelligence.
- Collaborate with internal clients (data science and product teams) to drive solutioning and POC discussions.
- Evolve the architectural capabilities and maturity of the data platform by engaging with enterprise architects and strategic internal and external partners.
- Develop and optimize procedures to “productionalize” data science models.
- Define and manage SLAs for data products and processes running in production.
- Support large-scale experimentation done by data scientists.
- Prototype new approaches and build solutions at scale.
- Research state-of-the-art methodologies.
- Create documentation for learnings and knowledge transfer.
- Create and audit reusable packages or libraries.
Qualifications
- 7+ years of overall technology experience, including at least 5 years of hands-on software development, data engineering, and systems architecture.
- 4+ years of experience with Data Lake Infrastructure, Data Warehousing, and Data Analytics tools.
- 4+ years of experience in SQL optimization and performance tuning, and development experience in programming languages such as Python, PySpark, and Scala.
- 2+ years of cloud data engineering experience in Azure.
- Fluent with Azure cloud services. Azure Certification is a plus.
- Experience with Azure Log Analytics.
- Experience with integration of multi-cloud services with on-premises technologies.
- Experience with data modelling, data warehousing, and building high-volume ETL/ELT pipelines.
- Experience with data profiling and data quality tools like Apache Griffin, Deequ, and Great Expectations.
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets.
- Experience with at least one MPP database technology such as Redshift, Synapse or Snowflake.
- Experience with running and scaling applications on cloud infrastructure and containerized services like Kubernetes.
- Experience with version control systems like GitHub and with deployment and CI tools.
- Experience with Azure Data Factory, Azure Databricks, and Azure Machine Learning tools.
- Experience with Statistical/ML techniques is a plus.
- Experience with building solutions in the retail or supply chain space is a plus.
- Understanding of metadata management, data lineage, and data glossaries is a plus.
- Working knowledge of agile development, including DevOps and DataOps concepts.
- Familiarity with business intelligence tools (such as Power BI).
- B.Tech/BA/BS in Computer Science, Math, Physics, or other technical fields.