Overview
Req number: R5758
Employment type: Full time
Worksite flexibility: Remote
Who we are
CAI is a global technology services firm with over 8,500 associates worldwide and a yearly revenue of $1 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right—whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
We are seeking a highly skilled and experienced Data Architect with a strong background in Big Data technologies, Databricks solutioning, and SAP integration within the manufacturing industry. The ideal candidate will have a proven track record of leading data teams, architecting scalable data platforms, and optimizing cloud infrastructure costs. This role requires deep hands-on expertise in Apache Spark, Python, SQL, and cloud platforms (Azure/AWS/GCP). This is a full-time, remote position.
Job Description
What You’ll Do
Design and implement scalable, secure, and high-performance Big Data architectures using Databricks, Apache Spark, and cloud-native services.
Lead the end-to-end data architecture lifecycle, from requirements gathering to deployment and optimization.
Design repeatable and reusable data ingestion pipelines for bringing in data from source systems such as SAP and other ERP, Salesforce, HR, factory, and marketing systems.
Collaborate with cross-functional teams to integrate SAP data sources into modern data platforms.
Drive cloud cost optimization strategies and ensure efficient resource utilization.
Provide technical leadership and mentorship to a team of data engineers and developers.
Develop and enforce data governance, data quality, and security standards.
Translate complex business requirements into technical solutions and data models.
Stay current with emerging technologies and industry trends in data architecture and analytics.
What You'll Need
6+ years of experience in Big Data architecture, data engineering, and AI-assisted BI solutions using Databricks and AWS technologies.
3+ years of experience with AWS data services such as S3, Glue, Lake Formation, EMR, Kinesis, RDS, and DMS.
3+ years of experience building Delta Lakes and open table formats using technologies such as Databricks and AWS analytics services.
Bachelor’s degree in computer science, information technology, data science, data analytics, or a related field.
Proven expertise in Databricks, Apache Spark, Delta Lake, and MLflow.
Strong programming skills in Python, SQL, and PySpark.
Experience with SAP data extraction and integration (e.g., SAP BW, S/4HANA, BODS).
Hands-on experience with cloud platforms (Azure, AWS, or GCP), especially in cost optimization and data lakehouse architectures.
Solid understanding of data modeling, ETL/ELT pipelines, and data warehousing.
Demonstrated team leadership and project management capabilities.
Excellent communication, problem-solving, and stakeholder management skills.
Experience in the manufacturing domain, with knowledge of production, supply chain, and quality data.
Certifications in Databricks, cloud platforms, or data architecture.
Familiarity with CI/CD pipelines, DevOps practices, and infrastructure as code (e.g., Terraform).
Physical Demands
This role involves mostly sedentary work, with occasional movement around the office to attend meetings.
Ability to perform repetitive tasks on a computer, using a mouse, keyboard, and monitor.
Reasonable Accommodation Statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to application.accommodations@cai.io or (888) 824-8111.