Chennai, Tamil Nadu, India
Information Technology
Other
PSI India

Overview
ID: 347 | 2-5 yrs | Jaipur | careers
Data Engineer + AI
Job Summary:
We are looking for a skilled and versatile Data Engineer with expertise in PySpark, Apache Spark, and Databricks, along with experience in analytics, data modeling, and Generative AI/Agentic AI solutions. This role is ideal for someone who thrives at the intersection of data engineering, AI systems, and business insights, and who wants to contribute to high-impact client programs.
Required Skills & Experience:
- Advanced proficiency in PySpark, Apache Spark, and Databricks for batch and streaming data pipelines.
- Strong experience with SQL for data analysis, transformation, and modeling.
- Expertise in data visualization and dashboarding tools (Power BI, Tableau, Looker).
- Solid understanding of data warehouse design, relational databases (PostgreSQL, Snowflake, SQL Server), and data lakehouse architectures.
- Exposure to Generative AI, RAG, embedding models, and vector databases (e.g., FAISS, Pinecone, ChromaDB).
- Experience with Agentic AI frameworks: LangChain, Haystack, CrewAI, or similar.
- Familiarity with cloud services for data and AI (Azure, AWS, or GCP).
- Excellent problem-solving and collaboration skills, and the ability to bridge engineering and business needs.
Preferred Skills:
- Experience with MLflow, Delta Live Tables, or other Databricks-native AI tools.
- Understanding of prompt engineering, LLM deployment, and multi-agent orchestration.
- Knowledge of CI/CD, Git, Docker, and DevOps pipelines.
- Awareness of Responsible AI, data privacy regulations, and enterprise data compliance.
- Background in consulting, enterprise analytics, or AI/ML product development.
Key Responsibilities:
- Design, build, and optimize distributed data pipelines using PySpark, Apache Spark, and Databricks to support both analytics and AI workloads.
- Support RAG pipelines, embedding generation, and data pre-processing for LLM applications.
- Create and maintain interactive dashboards and BI reports using Power BI, Tableau, or Looker for business stakeholders and consultants.
- Conduct ad hoc data analysis to support decision-making and enable rapid insight generation.
- Develop and maintain robust data warehouse schemas and star/snowflake models, and support data lake architecture.
- Integrate with and support LLM agent frameworks such as LangChain, LlamaIndex, Haystack, or CrewAI for intelligent workflow automation.
- Ensure data pipeline monitoring, cost optimization, and scalability in cloud environments (Azure/AWS/GCP).
- Collaborate with cross-functional teams including AI scientists, analysts, and business teams to drive use-case delivery.
- Maintain strong data governance, lineage, and metadata management practices using tools like Azure Purview or DataHub.