Data Engineer for AI | Antal Tech Jobs
16 Weeks ago

Data Engineer for AI

Pune, Maharashtra, India
Information Technology
Full-Time
Cloudera

Overview

Business Area:
Professional Services
Seniority Level:
Mid-Senior level
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
Role:
As a Customer Enablement Engineer specializing in Data Engineering for AI, you will design, develop, and deliver comprehensive curriculum content, including student guides, labs, quizzes, and certifications on data engineering and data preparation skills. This curriculum will enable Cloudera customers to effectively build AI systems on the Cloudera Hybrid platform.
Objective of this Role:
To ensure customers are enabled to prepare high-quality data that meets the requirements for efficiently building their ML/AI systems, including LLMs.
As the Data Engineer for AI you will:
  • Develop a high-quality, impactful “Data Engineering for AI” course
  • Enable instructors to successfully deliver the course in classrooms to our customers
  • Deliver hands-on workshops to customers, in person or remotely, on select course topics
  • Record and publish course content as online modules in digital format
  • Work with internal and external SMEs and customers to regularly seek input for improvement
  • Assist Edu sales leaders in selling educational products by serving as a technical resource
  • Own your self-development and stay resourceful. Enrich your knowledge of data analytics and AI topics as a self-learner.
We’re excited about you if you have:
  • Five (5) or more years of data engineering experience with SQL, Python, Hive, Spark, Flink, Kafka, NiFi, and Airflow
  • Hands-on experience developing data ingest pipelines (batch and real-time) from various data sources into large analytics platforms, data warehouses, data lakes, and lakehouses
  • Experience with one or more LMS (learning management systems)
  • Experience or education in preparing data (both structured and unstructured) for ML/AI model development, including training and fine-tuning of LLMs
  • Experience with data governance, data lineage, and metadata best practices
  • Experience using data quality and data profiling tools and data catalogs
  • Experience publishing technology education content on digital media platforms such as Udemy, LinkedIn, YouTube, or your own website, as a curriculum developer or independent contributor
  • Experience working in public cloud environments from one of the hyperscalers (AWS, Google Cloud, or Microsoft Azure). A cloud certification is preferred
  • Experience working with containers and Kubernetes. A certification in Kubernetes is preferred
  • Experience in (or training on) the Cloudera platform (CDP, HDP, or CDH) and any underlying Apache projects
  • Experience or training in preparing data for ML/AI model development including LLMs
  • Experience or training on Iceberg, Trino, and vector databases such as Pinecone or Milvus
  • Experience using configuration management tools such as Git, Ansible, Puppet or Chef
  • Familiarity with scripting tools such as bash shell scripts, Python and/or Perl
Soft Skills Essential
  • Ability to work closely with the curriculum content development team to define the operational requirements for technical training courses
  • Ability to build efficient, well-architected, easy-to-use hands-on lab environments
  • Ability to work as part of a remote, distributed team
It is a plus if you have:
  • A cloud certification on at least one hyperscaler: AWS, Azure, or GCP
  • Expertise in preprocessing unstructured data for generative AI, including tokenization and embedding generation
  • Proficiency with one or more vector databases (e.g., Pinecone, Milvus) for managing embeddings in semantic search and data retrieval.
  • Skills in handling large-scale datasets for LLMs, including sharding, distributed loading, and parallel data processing.
  • Knowledge of data lineage, versioning, and metadata tracking to ensure compliant, high-quality training data for generative AI.
What you can expect from us:
  • Generous PTO Policy
  • Support for work-life balance with Unplugged Days
  • Flexible WFH Policy
  • Mental & Physical Wellness programs
  • Phone and Internet Reimbursement program
  • Access to Continued Career Development
  • Comprehensive Benefits and Competitive Packages
  • Paid Volunteer Time
  • Employee Resource Groups
Cloudera is an Equal Opportunity / Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.