Bangalore, Karnataka, India
Space Exploration & Research, Information Technology
Full-Time
Global Payments Inc.
Description
RESPONSIBILITIES
- Design, develop, implement, test, and maintain scalable, efficient data pipelines for large-scale structured and unstructured datasets, including document, image, and event data used in GenAI and ML use cases.
- Collaborate closely with data scientists, AI/ML engineers, MLOps engineers, and Product Owners to understand data requirements and ensure data availability and quality.
- Build and optimize data architectures for both batch and real-time processing.
- Develop and maintain data warehouses and data lakes to store and manage large volumes of structured and unstructured data.
- Implement data validation and monitoring processes to ensure data integrity.
- Implement and manage vector databases (e.g., pgvector, Pinecone, FAISS) and embedding pipelines to support retrieval-augmented architectures; a minimal ingestion sketch follows this list.
- Support data sourcing and ingestion strategies, including APIs, data lakes, and message queues.
- Enforce data quality, lineage, observability, and governance standards for AI workloads.
- Work with cross-functional IT and business teams in an Agile environment to deliver successful data solutions.
- Help foster a data-driven culture via information sharing, design for scalability, and operational efficiency.
- Stay updated with the latest trends and best practices in data engineering and big data technologies.
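To ground the vector-database and embedding-pipeline responsibilities above, here is a minimal ingestion sketch in Python. It assumes a Postgres instance with the pgvector extension and the open-source sentence-transformers library; the `documents` table, connection string, and sample rows are hypothetical placeholders, not a description of this team's actual stack.

```python
# Minimal embedding-ingestion sketch: chunk text, embed it, and store
# vectors in Postgres/pgvector. Schema and DSN are hypothetical.
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

conn = psycopg2.connect("dbname=rag user=etl")  # placeholder DSN
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(
    """CREATE TABLE IF NOT EXISTS documents (
           id SERIAL PRIMARY KEY,
           content TEXT NOT NULL,
           embedding vector(384)
       );"""
)

def ingest(chunks: list[str]) -> None:
    """Embed each text chunk and insert it with its vector."""
    vectors = model.encode(chunks)  # one 384-dim vector per chunk
    for text, vec in zip(chunks, vectors):
        # pgvector accepts a '[v1,v2,...]' literal cast to vector
        literal = "[" + ",".join(str(x) for x in vec.tolist()) + "]"
        cur.execute(
            "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)",
            (text, literal),
        )
    conn.commit()

ingest(["Payment events are batched hourly.",
        "Chargebacks follow a 120-day window."])
```

At production scale an approximate-nearest-neighbor index (pgvector supports HNSW via `CREATE INDEX ... USING hnsw`) would typically be added; it is omitted here for brevity.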
QUALIFICATIONS
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering or a related field.
- Experience with the following:
- Strong programming skills in Python (and optionally Scala or Java), with Spark and Airflow or similar processing and orchestration tools; a minimal DAG sketch follows this list.
- Deep experience with cloud data platforms (e.g., Databricks, GCP BigQuery, Snowflake, AWS Glue).
- Strong familiarity with LLM workflows, RAG solutions, embeddings, reranking, and vector search concepts; a retrieval sketch also follows this list.
- Proficiency in SQL and experience with data modeling for AI/ML use cases.
- Experience with NoSQL databases (MongoDB, Cassandra, or similar).
- Knowledge of containerization technologies like Docker and orchestration systems like Kubernetes.
- Cloud platform experience with GCP, AWS, or Azure.
- Understanding of responsible AI principles as applied to data sourcing and processing.
- Excellent problem-solving and analytical skills.
- Excellent communication and collaboration skills.
- Experience with real-time data processing frameworks (Kafka, Flink, etc.).
- Experience working with data scientists on machine learning projects.
- Experience supporting generative AI model training or inference in production environments.
- Knowledge of LLMs and their integration with foundation AI platforms (e.g., AWS Bedrock, Google Vertex AI, Snowflake Cortex, Azure OpenAI).
- Hands-on experience with LangChain, LangGraph, CrewAI, or similar agent orchestration frameworks.
- Familiarity with machine learning frameworks (TensorFlow, PyTorch, scikit-learn).
- Experience with CI/CD tools and practices.
- Knowledge of data governance and data security best practices.
- Certifications in data engineering or cloud technologies.
- Ability to work with a high level of initiative, accuracy, and attention to detail.
- Ability to prioritize multiple assignments effectively and meet established deadlines.
- Ability to interact efficiently and professionally with staff and customers.
- Excellent organization skills.
- Ability to apply critical thinking to moderately and highly complex problems.
- Flexibility in meeting the business needs of the customer and the company.
- Ability to work creatively and independently with latitude and minimal supervision.
- Ability to utilize experience and judgment in accomplishing assigned goals.
- Experience navigating organizational structures.
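As a sketch of the orchestration skills listed above, here is a minimal Airflow 2.x DAG in Python. The DAG id, daily schedule, and task bodies are illustrative assumptions, not this team's actual pipelines.

```python
# Minimal Airflow 2.x DAG sketch: a daily extract -> transform chain.
# DAG id, schedule, and task bodies are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract() -> None:
    print("pull raw events from the source API")  # placeholder

def transform() -> None:
    print("clean and load into the warehouse")  # placeholder

with DAG(
    dag_id="example_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task  # run extract before transform
```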
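And to illustrate the vector-search side of an LLM/RAG workflow, a minimal retrieval sketch against the same hypothetical pgvector table used in the ingestion example above; the query text and value of k are assumptions.

```python
# Minimal RAG retrieval sketch: embed a query and fetch the k nearest
# chunks by cosine distance using pgvector's <=> operator.
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
conn = psycopg2.connect("dbname=rag user=etl")  # placeholder DSN

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k document chunks closest to the query embedding."""
    vec = model.encode([query])[0]
    literal = "[" + ",".join(str(x) for x in vec.tolist()) + "]"
    with conn.cursor() as cur:
        cur.execute(
            "SELECT content FROM documents "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (literal, k),
        )
        return [row[0] for row in cur.fetchall()]

print(retrieve("How are chargebacks handled?"))
```

The retrieved chunks would then be passed to an LLM as context; a reranking step between retrieval and generation is a common refinement.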