Overview
What you’ll do:
• Work with enterprise/application/solution architects, product architects, product owners, data scientists, and engineers to bring big data and data science R&D projects into production.
• Deliver hands-on design expertise in support of the Purchasing Power environment; build, optimize, and maintain conceptual and logical database models.
• Develop and maintain scalable data pipelines, and build out integrations to support continuing increases in data volume and complexity.
• Demonstrate technical expertise in enterprise-level cloud data architecture and advanced analytics solutions.
• Apply an understanding of conceptual, logical, and physical architectures for cloud-based data solutions (AWS preferred).
• Develop end-to-end cloud data solutions, including architecture, infrastructure, storage, data modeling, ETL/ELT, and consumption.
• Lead the technical design and solution architecture for migrating on-premises legacy data solutions to cloud-based data platforms. AWS is our preferred cloud infrastructure, and extensive AWS experience with EC2 and Redshift is strongly preferred.
• Define best practices and frameworks for capabilities across the data and analytics landscape.
• Design and evaluate open-source and vendor tools for data lineage.
• Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.
• Provide visionary leadership for projects related to big data technologies, and software development support for client research projects.
• Establish and participate in data management processes, including data lineage, data profiling, data quality management, data stewardship, and governance.
• Document the architecture and architectural decisions related to the assigned application portfolio.
• Develop highly scalable and extensible data platforms/warehouses that enable the collection, storage, modeling, and analysis of massive data sets.
Requirements:
• Experience with Java/J2EE and other web technologies.
• Experience working with a data pipeline in the cloud (AWS highly preferred).
• Experience with Scala, Kafka, Spark, ZooKeeper, Kafka Connect, Oracle, Dremio, Redshift, Elasticsearch, and other open-source technologies is required.
• Data visualization: any of Tableau, Power BI, QlikView, Domo.
• Data ingestion framework: Confluent Platform, including Kafka, Kafka Connect, and Schema Registry.
• Database platforms (DBaaS): Snowflake, Redshift, Azure SQL DW, Azure SQL DB, BigQuery.
• 5-7 years of dimensional data modeling experience.
• Strong scripting skills in Python and shell.
• In-depth understanding of database structure principles.
• Familiarity with data visualization tools.
• Strong interpersonal and management skills.
• Excellent verbal and written communication skills.
• Good understanding of SDLC process.
• Experience working in public cloud environments (e.g., AWS).
• Enterprise databases: Oracle, DB2, SQL Server, MongoDB, DynamoDB, and other NoSQL databases.
• Retail industry knowledge and experience required.
• Source control knowledge: GitHub.
• Agile/Scrum experience preferred.
• Ability to work in a fast-paced agile development environment.