Job Title : Data Engineer / AI Data Pipeline Engineer | Antal Tech Jobs
1 Day ago

Job Title : Data Engineer / AI Data Pipeline Engineer

Delhi, DL, India
Information Technology
Full-Time
Web Spiders

Overview

We are looking for a hands-on Data Engineer / AI Data Pipeline Engineer to join our growing engineering team. You'll work on cutting-edge AI-powered data enrichment, taxonomy validation, and scalable reporting frameworks across large-scale retail and enterprise datasets. The role sits at the intersection of data engineering and applied LLMs, and requires strong skills in Python, SQL, AWS cloud services, modern ETL architecture, and LLM-powered automation workflows.

Location: Kolkata, Rajarhat
Type: Full Time
Department: IT Technology

Experience: 3–4 years
Location: Rajarhat-Newtown (Kolkata)
Employment Type: Full-time, Onsite
Timing: Ability to work in the US Eastern time zone. This may be relaxed to half day IST and half day US EST, based on project needs.
Documents: Must have Aadhar Card, Education Certificates that are verifiable, Past company letters (if applicable) and criminal background clearance.

Key Skills Required

AI-Powered Taxonomy Audit & Enrichment:
  • Design and develop scalable, AI-driven taxonomy audit pipelines for retail store and brand data validation.
  • Build automated workflows leveraging LLMs (GPT-4o / OpenAI APIs) for classification, enrichment, and ontology standardization, using Instructor and Pydantic for reliable structured outputs.
  • Integrate web research and scraping systems (Serper API, ScrapingBee, html2text) to validate structured and unstructured data.
  • Develop human-in-the-loop review workflows using Label Studio for confirm/edit/reject audit processes.
  • Improve taxonomy coverage and entity-resolution accuracy through AI-assisted clustering and enrichment of unmapped transaction data.

Data Engineering & Pipeline Development:
  • Build and maintain modular, reusable ETL/data pipeline frameworks.
  • Refactor legacy reporting systems into modern, maintainable architectures with reusable SQL modules and query builders.
  • Develop validation frameworks, logging systems, automated migration workflows, and configurable comparison contexts.
  • Orchestrate workflows with Apache Airflow (DAGs, PythonOperator, XCom) and cloud-native AWS services.
  • Ensure backward compatibility and production stability during migration initiatives.

Reporting & Cloud Infrastructure:
  • Develop and optimize advanced SQL queries and reporting pipelines on Amazon Redshift / Redshift Serverless and PostgreSQL (RDS).
  • Manage data workflows using AWS services including S3, Lambda, Glue, CloudWatch, SSM Parameter Store, and Secrets Manager.
  • Monitor production pipelines, troubleshoot issues, and improve performance and reliability.
  • Collaborate with cross-functional teams across Data Engineering, AI/ML, QA, and Product.

Required Skills & Experience

  • 3–4 years of experience in Python-based data engineering or backend engineering.
  • Strong proficiency in Python, including pandas, requests, psycopg2, and boto3, with solid modular application development.
  • Hands-on experience with Apache Airflow (DAGs, PythonOperator, XCom).
  • Strong advanced SQL skills and a solid grasp of data warehousing concepts.
  • Experience with Amazon Redshift and PostgreSQL.
  • Sound understanding of ETL/data pipeline architecture and workflow orchestration.
  • Hands-on experience with AWS services: S3, Lambda, Glue, CloudWatch, SSM Parameter Store, and Secrets Manager.
  • Experience integrating LLM APIs (GPT-4o / OpenAI) into production workflows.
  • Familiarity with web scraping, search APIs, and data enrichment systems.
  • Experience with Git/GitHub, Jira, and Confluence.
  • Strong debugging, problem-solving, and analytical skills.

Good to Have

  • Experience with Instructor, Pydantic, or AI workflow orchestration frameworks.
  • Exposure to Label Studio or other human-review annotation systems.
  • Experience with AI-assisted entity resolution and taxonomy/ontology systems.
  • Familiarity with scalable, modular ETL framework design.
  • Background in retail transaction data or taxonomy/master-data management.

Tech Stack

  • Languages & Libraries: Python, advanced SQL, pandas, boto3, psycopg2, requests
  • Orchestration: Apache Airflow
  • AI / LLM: GPT-4o / OpenAI APIs, Instructor, Pydantic
  • Data & Warehousing: Amazon Redshift / Redshift Serverless, PostgreSQL (RDS)
  • AWS: S3, Lambda, Glue, CloudWatch, SSM Parameter Store, Secrets Manager
  • Scraping & Search: Serper API, ScrapingBee, html2text
  • Human Review: Label Studio
  • Collaboration: Git/GitHub, Jira, Confluence

Preferred Candidate Profile

  • Self-driven, with end-to-end ownership of data workflows.
  • Comfortable in fast-paced AI/data engineering environments.
  • Strong communication and collaboration skills.
  • Passionate about building scalable, AI-assisted automation systems.
