Overview
The AI Architect requires 7+ years of quality hands-on experience in AI development, with a focus on bringing systems live for more than 100 users, not just building proofs of concept.
AI Data Architect
About the Role
As an AI Data Architect, you will be a key player in shaping the foundation for our cutting-edge Artificial Intelligence and Machine Learning solutions. You will specialize in designing, building, and optimizing scalable and robust architectures that cater to the unique demands of AI/ML models – from data ingestion and preparation for training to serving data for real-time inference. This role requires a deep understanding of data warehousing, streaming technologies, the lifecycle of machine learning models, and Google Cloud. You'll collaborate closely with data scientists, ML engineers, and data engineers to ensure our AI initiatives are powered by high-quality, accessible, and performant data.
Key Responsibilities
AI/ML Data Architecture Design
- Design and implement end-to-end data architectures specifically optimized for AI/ML workflows, including data ingestion, feature engineering, model training data sets, and real-time inference data serving.
- Develop architectural blueprints and roadmaps for MLOps platforms, ensuring seamless integration of data pipelines with model development, deployment, and monitoring.
- Evaluate and select appropriate data storage solutions (e.g., data lakes, feature stores, vector databases) and processing frameworks tailored for AI/ML workloads.
Data Preparation & Feature Engineering
- Define strategies and patterns for scalable data preparation and feature engineering, working closely with data scientists to transform raw data into model-ready features.
- Design and implement data pipelines to automate the creation, versioning, and management of features for ML models.
Model Operationalization Data Support
- Architect data flows to support the deployment and operationalization of ML models, including real-time data feeding for inference engines and batch scoring processes.
- Design robust monitoring and alerting mechanisms for data quality and drift affecting deployed models.
Data Governance & MLOps Integration
- Implement data governance policies and security measures (e.g., access controls, data anonymization) specific to AI/ML data, ensuring ethical AI practices and regulatory compliance.
- Collaborate with MLOps engineers to integrate data architecture with CI/CD pipelines for ML models, ensuring reproducible and scalable ML workflows.
- Establish metadata management strategies for features, models, and datasets within the AI/ML ecosystem.
Technical Consultation & Innovation
- Provide expert guidance to data scientists, ML engineers, and business stakeholders on data architecture best practices for AI/ML projects.
- Research and evaluate emerging AI/ML data technologies and frameworks, recommending innovative solutions to enhance our capabilities.
- Troubleshoot and optimize data pipelines and data access patterns for AI/ML performance and efficiency.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, AI/ML, or a related quantitative field.
- 7+ years of experience in data architecture, with at least 3 years focused on supporting AI/ML initiatives.
- Proven experience designing and implementing data architectures for machine learning pipelines, including data lakes, data warehousing, and real-time data streaming.
- Deep understanding of the ML lifecycle (MLOps) and the data requirements at each stage (data collection, feature engineering, training, inference, monitoring).
- Strong proficiency with SQL and at least one programming language essential for AI/ML data processing (e.g., Python, Scala).
- Extensive experience with cloud data platforms (preferably GCP) and their AI/ML-specific data services (e.g., Google Cloud AI Platform, AWS SageMaker, Azure ML, Databricks, Snowflake).
- Familiarity with ML frameworks and libraries (e.g., ADK, MCP, Vertex AI) from a data architecture perspective.
- Knowledge of feature stores, vector databases, and MLOps tools.
- Strong analytical and problem-solving skills, with the ability to translate complex AI/ML requirements into technical data solutions.
- Excellent communication and collaboration skills, able to bridge the gap between data science and data engineering teams.
Preferred Skills
- Experience with big data processing frameworks such as Apache Spark and Apache Flink.
- Knowledge of containerization (Docker) and orchestration (Kubernetes) in an ML context.
- Experience with data governance tools specifically for AI/ML data.
- Familiarity with responsible AI principles and their implications for data architecture.
- Contributions to open-source projects related to AI/ML data infrastructure.
- Certifications in relevant cloud AI/ML or data engineering specialties.