Overview
Shift: 5 hours of overlap with PST.
Key Responsibilities
- Build and optimize data pipelines to extract and transform metadata from PDFs, documents, and other unstructured formats using OCR tools (Textract, Tesseract).
- Integrate image, video, and audio analysis pipelines using tools such as OpenCV, Rekognition, and Whisper.
- Index structured and unstructured content using Elasticsearch or OpenSearch for fast, scalable search.
- Design and implement RESTful APIs for client access to the indexed data.
- Implement initial support for semantic and vector search using FAISS or Weaviate (as applicable).
- Ensure tenant-specific data isolation, access control, and search result relevance.
Relevant Experience & Qualifications
- 3–6 years of experience in backend development and/or data engineering roles.
- Strong programming skills in Python (preferred), with experience in building APIs and ETL pipelines.
- Hands-on experience with search indexing and retrieval using Elasticsearch or similar tools.
- Experience handling and processing unstructured data (text, image, video).
- Familiarity with vector-based search techniques and libraries.
- Experience with AWS and containerized deployment (Docker, ECS, Lambda, etc.) is a plus.
- Strong collaboration skills and ability to work in a small, focused team aligned to tight delivery timelines.
Job Types: Full-time, Permanent
Pay: ₹2,200,000.00 - ₹2,500,000.00 per year
Benefits:
- Health insurance
- Provident Fund
Schedule:
- US shift
Supplemental Pay:
- Performance bonus
Application Question(s):
- Are you an immediate joiner?
- What is your notice period?
Experience:
- Back-end development: 5 years (Preferred)
- Data Engineering: 5 years (Preferred)
Work Location: Remote