Overview
Job Description
We're looking for a hard-working, thoughtful Data Scientist who delivers results. You'll transform messy data into useful models and decision-making tools, collaborate closely with engineering and business development teams, and own projects from start to finish—including scoping, deployment, and monitoring.
In Business Development at Millennium, we identify and onboard trading talent and data advantages that drive profitability. We evaluate data vendor solutions and surface novel data points that create early, differentiated signals ahead of competitors. Some work involves delivering reliable, established solutions, while much of it is creative—extracting hidden insights from existing data and discovering new sources.
Strong Python skills are required. Hands-on experience with large language models (LLMs) and their practical applications is a significant plus. We value humility, reliability, clear communication, and collaborative teamwork.
Responsibilities:
Own outcomes: break projects into milestones, estimate realistically, meet deadlines, and surface risks early with options.
Build and deploy models: develop ML models (including LLM-enabled solutions when appropriate), design features, evaluate rigorously, and contribute to production-grade pipelines.
Work the full data lifecycle: wrangle and clean data, implement data quality checks, write maintainable Python/SQL, add tests, and document decisions.
Collaborate cross-functionally: translate business questions into analysis, present trade-offs clearly, and iterate with stakeholders and tech partners.
Deploy to production: partner with data engineers to ship models and tools (APIs, batch jobs, monitoring) and create feedback loops for continuous improvement.
Drive system improvements: propose pragmatic changes to data, tooling, and process that reduce manual work and increase reliability.
Qualifications/Skills Required
Bachelor's degree in Computer Science, Data Science, Statistics, or a related field.
3+ years of hands-on experience in data science, analytics, or ML, with at least one end-to-end project shipped to production and used by real stakeholders.
Strong Python proficiency (Pandas, NumPy, Matplotlib, and Scikit-learn).
Sound understanding of ML fundamentals: problem framing, validation, metrics, overfitting, feature engineering.
Experience with SQL and NoSQL databases, including relational, columnar, and document stores.
Experience with LLMs and modern NLP: prompt engineering, retrieval-augmented generation (RAG), vector databases, knowledge graphs—plus a pragmatic sense of when not to use them.
Strong data visualization and storytelling abilities using tools such as Plotly, Dash, Streamlit, or similar frameworks.
Familiarity with cloud services (AWS/Azure) for data storage, compute, and deployment.
How we work (what success looks like)
Strong ownership: Plan thoroughly, communicate proactively, and meet deadlines. We value reliability over last-minute heroics.
Humble and collaborative: Seek and provide constructive feedback, write clear documentation, and pair program when it helps the team.
Bias toward action: Start simple, deliver iteratively, and improve based on evidence.