Overview
Job Description: Data AI Engineer (Data Building)
Job Summary: We are seeking an experienced Senior Data AI Engineer to lead the design, development, and optimization of large-scale data pipelines and AI-driven systems. The ideal candidate will specialize in building and maintaining robust data architectures that support AI and machine learning workflows, ensuring seamless integration of data sources and delivering high-quality, actionable insights for business decision-making.
Key Responsibilities:
Data Architecture and Engineering:
1. Design, build, and maintain scalable, high-performance data pipelines and architectures for structured and unstructured data.
2. Develop ETL/ELT processes to extract, transform, and load data from diverse sources into centralized data warehouses or lakes (see the orchestration sketch after this list).
3. Optimize data storage solutions for AI/ML workflows, including real-time data streaming and batch processing systems.
4. Ensure the accuracy, consistency, and security of data across platforms.
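Purely as an illustration of the kind of pipeline described above, the sketch below wires extract, transform, and load steps into an Apache Airflow DAG (Airflow and Python appear later under Skillsets); the file paths, table name, and SQLite "warehouse" are hypothetical placeholders, not requirements of this role.

from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from sqlalchemy import create_engine


def extract():
    # Pull raw records from a hypothetical CSV drop location into staging.
    df = pd.read_csv("/data/raw/orders.csv")
    df.to_parquet("/data/staging/orders.parquet")


def transform():
    # Clean the staged batch and aggregate revenue per day.
    df = pd.read_parquet("/data/staging/orders.parquet")
    df = df.dropna(subset=["order_id"])
    daily = df.groupby("order_date", as_index=False)["amount"].sum()
    daily.to_parquet("/data/staging/daily_revenue.parquet")


def load():
    # Load the transformed data into a warehouse table (SQLite stands in here).
    df = pd.read_parquet("/data/staging/daily_revenue.parquet")
    engine = create_engine("sqlite:////data/warehouse.db")
    df.to_sql("daily_revenue", engine, if_exists="replace", index=False)


# Airflow 2.4+ assumed for the `schedule` argument.
with DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task

Keeping each stage as its own task lets a failed run restart at the step that broke instead of rerunning the whole pipeline.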
AI/ML Workflow Integration:
1. Collaborate with data scientists to implement and deploy machine learning models into production systems.
2. Develop reusable AI pipelines and workflows that streamline feature engineering, model training, and inference (see the sketch after this list).
3. Implement monitoring systems to track model performance and data quality over time.
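As an illustrative sketch only, the snippet below shows one way such a reusable pipeline might bundle feature engineering, training, and inference with scikit-learn (listed later under AI/ML Tools); the toy data, feature count, and model choice are hypothetical assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline() -> Pipeline:
    # Feature engineering (scaling) and the model live in one reusable object,
    # so training and inference always apply identical preprocessing.
    return Pipeline([
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.normal(size=(500, 4))            # hypothetical feature matrix
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # hypothetical labels

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    pipeline = build_pipeline()
    pipeline.fit(X_train, y_train)

    # Inference plus a quality metric that a monitoring job could track over time.
    preds = pipeline.predict(X_test)
    print("holdout accuracy:", accuracy_score(y_test, preds))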
Data Analytics and Insights:
1. Work with cross-functional teams to identify business needs and translate them into technical data solutions.
2. Design and implement data frameworks to support advanced analytics and predictive modeling.
3. Drive innovations in AI-driven decision-making through effective data utilization and tool implementation.
System Performance and Optimization:
1. Monitor and enhance the performance, reliability, and scalability of data systems.
2. Implement data validation, anomaly detection, and quality checks to ensure integrity in AI processes (see the sketch after this list).
3. Troubleshoot and resolve performance issues in data pipelines and AI systems.
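A minimal sketch of the kind of validation and anomaly checks meant here is shown below, using pandas; the column names, required schema, and 3-sigma threshold are hypothetical assumptions, not part of this posting.

import pandas as pd

REQUIRED_COLUMNS = {"order_id", "order_date", "amount"}


def validate_batch(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable data-quality issues found in a batch."""
    issues = []

    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        issues.append(f"missing columns: {sorted(missing)}")
        return issues  # later checks assume the schema is present

    null_counts = df[list(REQUIRED_COLUMNS)].isna().sum()
    for column, count in null_counts[null_counts > 0].items():
        issues.append(f"{count} null values in '{column}'")

    # Flag amounts more than 3 standard deviations from the batch mean.
    mean, std = df["amount"].mean(), df["amount"].std()
    if std and std > 0:
        outliers = df[(df["amount"] - mean).abs() > 3 * std]
        if len(outliers):
            issues.append(f"{len(outliers)} anomalous 'amount' values (>3 sigma)")

    return issues


if __name__ == "__main__":
    sample = pd.DataFrame({
        "order_id": [1, 2, 3, None],
        "order_date": ["2024-01-01"] * 4,
        "amount": [10.0, 12.0, 11.0, 10_000.0],
    })
    print(validate_batch(sample))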
Leadership and Collaboration:
1. Lead and mentor junior data engineers and provide technical expertise to the team.
2. Collaborate with stakeholders, including business analysts, data scientists, and product managers, to align technical solutions with business goals.
3. Research emerging technologies and provide recommendations for continuous improvement in data and AI systems.
Qualifications:
Education:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Artificial Intelligence, or a related field.
Certifications (preferred):
- Google Professional Data Engineer
- AWS Certified Data Analytics
- Microsoft Azure Data Engineer
- Certified Artificial Intelligence Engineer (optional)
Experience:
- 5+ years of experience in data engineering, with at least 2 years in AI/ML workflows.
- Proven expertise in building and optimizing large-scale data pipelines and architectures.
Skillsets:
Technical Skills:
1. Data Engineering:
- Proficiency in ETL/ELT tools like Apache Airflow, Talend, or Informatica.
- Hands-on experience with data lakes, data warehouses (Snowflake, Redshift, BigQuery), and databases (SQL, NoSQL).
2. Programming and Scripting:
- Expertise in Python, Scala, or Java for data processing and AI integration.
- Familiarity with data query languages like SQL and Spark SQL.
3. Big Data Technologies:
- Experience with Hadoop, Apache Spark, and Kafka for large-scale data processing.
- Familiarity with real-time data streaming tools like Flink or Kinesis.
4. Cloud Platforms:
- Hands-on experience with AWS, Azure, or Google Cloud for data pipelines and AI workflows.
- Familiarity with cloud-native services like AWS Glue, Azure Data Factory, or Google Dataflow.
5. AI/ML Tools:
- Knowledge of frameworks like TensorFlow, PyTorch, or Scikit-learn.
- Experience with MLOps tools such as MLflow, Kubeflow, or TFX (a minimal experiment-tracking sketch follows this list).
- Expertise in developing custom models for specific use cases using Llama and other popular foundation models.
6. Data Modeling and Governance:
- Expertise in data modeling, schema design, and metadata management.
- Understanding of data governance principles, data lineage, and compliance standards (e.g., GDPR, HIPAA).
7. Performance Optimization:
- Proficiency in optimizing queries, pipelines, and data storage for speed and efficiency.
- Familiarity with distributed computing and parallel processing.
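As a small illustration of the MLOps tooling named in item 5, the sketch below logs parameters and a metric to MLflow; the experiment name, run name, parameters, and metric value are hypothetical placeholders.

import mlflow

mlflow.set_experiment("churn-model-dev")

with mlflow.start_run(run_name="baseline"):
    # Record the configuration used for this (hypothetical) training run.
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_param("max_iter", 1000)

    # In a real pipeline the model would be trained here; a placeholder
    # evaluation metric is logged so later runs have a baseline to compare against.
    mlflow.log_metric("holdout_accuracy", 0.87)

Tracking runs this way is what lets the monitoring described under Key Responsibilities compare model performance across retraining cycles.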
Soft Skills:
1. Strong problem-solving and analytical thinking abilities.
2. Excellent communication skills to convey complex technical concepts to non-technical stakeholders.
3. Leadership and mentoring skills to guide junior team members.
4. Ability to work collaboratively in a fast-paced, agile environment.
5. Strategic thinking with a focus on innovation and scalability.
Willingness to travel abroad and work on-site when required.
Why Join Us?
- Opportunity to work on cutting-edge technologies and impactful projects.
- Be part of a dynamic, innovative, and supportive team.
- Professional development opportunities.
Job Types: Full-time, Permanent
Benefits:
- Paid sick time
- Provident Fund
Schedule:
- Day shift
- Monday to Friday
Work Location: In person