Overview
You will be responsible for the end-to-end quality strategy of AI-driven products. This includes designing sophisticated test plans that account for model drift and bias, conducting deep-dive functional and regression testing, and serving as the primary gatekeeper for Responsible AI and Data Protection compliance.
Key Responsibilities
1. Test Strategy & Execution
- Design Comprehensive Test Plans: Create strategies that cover the entire AI lifecycle, from data ingestion and model training to production inference.
- Functional & Regression Testing: Validate AI features against business requirements. Implement automated regression suites to ensure that model updates (retraining) do not degrade performance on "golden datasets."
- Evaluation Frameworks: Build scoring mechanisms for non-deterministic outputs using metrics such as precision, recall, and F1 score, alongside text-overlap and semantic-similarity measures (e.g., ROUGE, BLEU).
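As a minimal sketch of the kind of scoring harness this role would build, the snippet below computes precision, recall, and F1 for model predictions against golden-dataset labels. The function name `compute_prf` and the sample data are illustrative, not part of any existing codebase.

```python
# Minimal evaluation sketch: score predictions against a golden dataset.
# compute_prf is a hypothetical helper name; the data below is illustrative.

def compute_prf(predictions, golden_labels, positive_label=1):
    """Compute precision, recall, and F1 against golden labels."""
    tp = sum(1 for p, g in zip(predictions, golden_labels)
             if p == positive_label and g == positive_label)
    fp = sum(1 for p, g in zip(predictions, golden_labels)
             if p == positive_label and g != positive_label)
    fn = sum(1 for p, g in zip(predictions, golden_labels)
             if p != positive_label and g == positive_label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Example: the model flags 4 items as positive; the golden set has 2 positives.
preds  = [1, 0, 1, 1, 0, 1]
golden = [1, 0, 0, 1, 0, 0]
p, r, f = compute_prf(preds, golden)  # p = 0.5, r = 1.0
```

In a real suite, a check like `assert f >= BASELINE_F1` (with a threshold agreed per release) would run in CI after each retraining to catch regressions.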
2. Responsible AI & Ethics
- Bias & Fairness Testing: Design and execute tests to identify demographic, cultural, or algorithmic biases in model outputs.
- Adversarial Testing (Red Teaming): Perform "jailbreak" and prompt-injection testing to ensure safety guardrails cannot be bypassed.
- Explainability: Validate that model decisions are interpretable and meet transparency standards for stakeholders and regulators.
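One common, simple fairness check is demographic parity: comparing the rate of positive model outcomes across groups. The sketch below (group labels and data are illustrative assumptions) computes the largest gap between groups; the acceptable tolerance is a policy decision, not something the code dictates.

```python
# Hedged sketch of a demographic parity check for bias testing.
# Group names "A"/"B" and the sample outcomes are illustrative only.

from collections import defaultdict

def positive_rate_by_group(outcomes, groups):
    """Rate of positive model outcomes (1) per demographic group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for outcome, group in zip(outcomes, groups):
        totals[group] += 1
        positives[group] += int(outcome == 1)
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_gap(outcomes, groups):
    """Max difference in positive rates across groups (0 = perfect parity)."""
    rates = positive_rate_by_group(outcomes, groups)
    return max(rates.values()) - min(rates.values())

outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
groups   = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(outcomes, groups)  # A: 3/4, B: 1/4 -> gap 0.5
```

A fairness test suite would typically run checks like this across several protected attributes and alert when the gap exceeds an agreed threshold.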
3. Data Protection & Compliance
- Data Quality Assurance: Audit training and testing datasets for completeness, consistency, and the presence of PII (Personally Identifiable Information).
- Regulatory Compliance: Ensure all AI workflows adhere to global standards such as GDPR, the EU AI Act, and industry-specific privacy laws.
- Data Security: Collaborate with security teams to prevent data poisoning and unauthorized access to model weights or sensitive training inputs.
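A PII audit over free-text dataset records can start with pattern matching, as in the sketch below. The two regexes (email, US-style SSN) are deliberately simplified examples, not an exhaustive or production-grade detector; real audits would use dedicated tooling and broader pattern sets.

```python
# Illustrative PII scan over dataset records using stdlib regexes.
# The patterns (email, US SSN) are simplified examples, not a full audit.

import re

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(records):
    """Return (record_index, pii_type) pairs for records with likely PII."""
    hits = []
    for i, text in enumerate(records):
        for pii_type, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                hits.append((i, pii_type))
    return hits

sample = [
    "User asked about pricing tiers.",
    "Contact me at jane.doe@example.com",
    "SSN on file: 123-45-6789",
]
hits = find_pii(sample)  # [(1, 'email'), (2, 'ssn')]
```

Flagged records would then be quarantined or redacted before the dataset is approved for training or evaluation.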
Technical Skills
- Programming: Mastery of Python, including testing and data libraries such as pytest, pandas, and NumPy.
- AI/ML Frameworks: Familiarity with PyTorch, TensorFlow, or Hugging Face Transformers.
- Testing Tools: Experience with AI-specific tools like Evidently AI (monitoring), Great Expectations (data validation), or Giskard (AI testing).
- Infrastructure: Strong knowledge of CI/CD pipelines (GitHub Actions, Jenkins) and cloud platforms (AWS, Azure, or GCP).
Experience
- 4+ years in Software Quality Assurance, with at least 2 years specifically focused on AI/ML systems.
- Proven track record of implementing automated testing for RESTful APIs and integrated AI components.
- Experience in Risk-Based Testing and managing the defect lifecycle in an Agile environment.