Overview
About the internship:
Tritone Analytics is building a next-generation royalty auditing platform for the music industry. We help artists, managers, and rights-holders identify unpaid or misreported royalties by combining deterministic data processing with modern AI systems.
We are seeking a 'Vector Systems & Retrieval Engineering Intern' to support the design, evaluation, and implementation of vector-based retrieval systems that power our audit workflows. This is not a generic 'AI' role. You will work on embeddings, vector databases, and retrieval quality, helping ensure that downstream LLM and agent workflows are grounded in correct, complete context. You'll work closely with engineers on real production-oriented systems used to analyze contracts, royalty statements, and metadata at scale.
Selected intern's day-to-day responsibilities include:
1. Generating and managing text embeddings for contracts, statements, and metadata.
2. Supporting vector database design and indexing (e.g., OpenSearch, pgvector, Qdrant-style systems).
3. Comparing and evaluating embedding models (dimension size, cost, recall trade-offs).
4. Assisting with re-embedding and re-indexing workflows.
5. Measuring retrieval quality (precision, recall, latency).
6. Supporting hybrid retrieval pipelines (vector + structured/graph-based context).
7. Helping document best practices for chunking strategies, vector schema design, and retrieval evaluation.
8. Supporting LLM retrieval pipelines (RAG) with clean, reliable vector inputs.
Who can apply:
Only those candidates can apply who:
- are available for the work from home job/internship
- can work from 8:30 pm - 12:30 am Indian Standard Time (as the company is based outside of India & their local work timings are 10:00 am - 2:00 pm Eastern Standard Time)
- can start the work from home job/internship between 16th Jan'26 and 20th Feb'26
- are available for a duration of 2 months
- have relevant skills and interests
- are Computer Science Engineering students
* Women wanting to start/restart their career can also apply.
Stipend:
USD 300 - 400 /month
Deadline:
2026-02-15 23:59:59
Skills required:
Python, Machine Learning, Natural Language Processing (NLP), Database Management System (DBMS), Artificial Intelligence, and Data Engineering
Other Requirements:
1. Ability to structure unstructured data, with experience transforming raw text (contracts, documents, statements, logs) into structured or semi-structured formats suitable for embeddings, indexing, and analysis.
2. Working knowledge of Python, with comfort reading and modifying scripts.
3. Basic SQL proficiency, with the ability to write simple to intermediate queries to inspect datasets, validate retrieval results, and cross-check vector search outputs against structured data.
4. Basic understanding of embeddings, vector similarity (cosine/dot product), and NLP fundamentals.
5. Familiarity with at least one vector database (OpenSearch, Pinecone, Weaviate, Qdrant, FAISS, pgvector).
6. Familiarity with sentence-transformers or embedding APIs.
7. Comfort working with unstructured text data.
8. Experience with OpenSearch or AWS-adjacent tooling.
9. Experience with LangChain, LlamaIndex, or retrieval pipelines.
10. Experience with basic ETL or data pipelines.
11. Understanding of embedding dimensionality trade-offs.
12. Understanding of retrieval precision versus recall.
13. Exposure to knowledge graphs or hybrid retrieval concepts.
14. Interest in AI infrastructure rather than prompt engineering.
15. Potential and interest in growing into a full-time member of the team.
About Company:
Tritone Analytics is an automated music royalty auditing platform, as well as a unified music industry accounting and analytics platform.