Data Engineer for AI | Antal Tech Jobs
5 Days ago

Data Engineer for AI

Mumbai, Maharashtra, India
Information Technology
Full-Time
Cloudera

Overview

Business Area:

Professional Services

Seniority Level:

Mid-Senior level

Job Description:

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

Role:

As a Customer Enablement Engineer specializing in Data Engineering for AI, you will design, develop, and deliver comprehensive curriculum content, including student guides, labs, quizzes, and certifications on data engineering and data preparation skills. This curriculum will enable Cloudera customers to effectively build AI systems on the Cloudera Hybrid platform.

Objective of this Role:

To ensure customers are enabled to prepare high-quality data that meets the requirements for efficiently building their ML/AI systems, including LLMs.

As the Data Engineer for AI, you will:

  • Develop a high-quality, impactful “Data Engineering for AI” course
  • Enable instructors to successfully deliver the course in classrooms to our customers
  • Deliver hands-on workshops to customers, in person or remotely, on select course topics
  • Record and publish course content as online modules in digital format
  • Work with internal and external SMEs and customers to regularly seek input for improvement
  • Assist Edu sales leaders in selling educational products by serving as a technical resource
  • Own your self-development and stay resourceful at all times. Enrich your knowledge of various topics in data analytics and AI by being a self-learner.


We’re excited about you if you have:

  • Five (5) or more years of data engineering experience with SQL, Python, Hive, Spark, Flink, Kafka, NiFi, and Airflow
  • Hands-on experience developing data ingest pipelines (batch and real-time) from various data sources into large analytics platforms, data warehouses, data lakes, and lakehouses
  • Experience with one or more LMS (learning management systems)
  • Experience or education in preparing data (both structured and unstructured) for ML/AI model development, including training and fine-tuning of LLMs
  • Experience with data governance, data lineage, and metadata best practices
  • Experience using data quality and data profiling tools and data catalogs
  • Experience publishing technology education content on digital media platforms such as Udemy, LinkedIn, YouTube, or your own website, as a curriculum developer or independent contributor
  • Experience working in public cloud environments from one of the hyperscalers, such as AWS, Google Cloud, or Microsoft Azure. A cloud certification is preferred
  • Experience working with containers and Kubernetes. A certification in Kubernetes is preferred
  • Experience in (or training on) the Cloudera platform (CDP, HDP, or CDH) and any underlying Apache projects
  • Experience or training in preparing data for ML/AI model development including LLMs
  • Experience or training on Iceberg, Trino, and vector databases such as Pinecone or Milvus
  • Experience using configuration management tools such as Git, Ansible, Puppet or Chef
  • Familiarity with scripting tools such as bash shell scripts, Python and/or Perl


Essential Soft Skills

  • Ability to work closely with the curriculum content development team to define the operational requirements for technical training courses
  • Ability to build efficient, well-architected, easy-to-use hands-on lab environments
  • Ability to work as part of a remote, distributed team


It is a plus if you have:

  • Cloud certification from at least one hyperscaler: AWS, Azure, or GCP
  • Expertise in preprocessing unstructured data for generative AI, including tokenization and embedding generation
  • Proficiency with one or more vector databases (e.g., Pinecone, Milvus) for managing embeddings in semantic search and data retrieval.
  • Skills in handling large-scale datasets for LLMs, including sharding, distributed loading, and parallel data processing.
  • Knowledge of data lineage, versioning, and metadata tracking to ensure compliant, high-quality training data for generative AI.


What you can expect from us:

  • Generous PTO Policy
  • Support for work-life balance with Unplugged Days
  • Flexible WFH Policy
  • Mental & Physical Wellness programs
  • Phone and Internet Reimbursement program
  • Access to Continued Career Development
  • Comprehensive Benefits and Competitive Packages
  • Paid Volunteer Time
  • Employee Resource Groups


Cloudera is an Equal Opportunity / Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.
