Overview
This is a high-ownership “do whatever it takes” role for someone who wants to operate at founder speed, learn the full stack of an insurance/warranty business, and ship work that directly moves revenue, conversion, and retention.
What you’ll do
You will build the agentic layer of our core product: AI systems that reason, take actions, and reliably complete workflows across pricing/underwriting, policy issuance, claims intake, adjudication, fulfillment (repair/replacement/reimbursement), and other parts of the business.
Key responsibilities
Design and ship production-grade AI agents that run real business processes (not demos)
Build agentic architectures: orchestration, tool calling, state machines, memory, permissions, audit trails, human-in-the-loop, and fallback paths
Own our RAG platform end-to-end: ingestion, chunking, embeddings, retrieval, reranking, citations/grounding, and hallucination mitigation
Build evaluation and monitoring systems: offline eval sets, regression tests, online metrics, drift detection, and red-team suites
Implement model optimization: prompt systems, structured outputs, fine-tuning where appropriate, latency/cost optimization, caching, and throughput tuning
Build core ML systems for warranty/claims: document understanding, extraction, classification, anomaly/fraud signals, decision support, and SLA routing
Partner tightly with product/ops to translate real workflows into deterministic, testable, compliant automation
What you’ll build (examples)
Underwriting/pricing agents: real-time quote decisions using merchant/product/context signals with strict guardrails and auditability
Claims copilot + auto-adjudication engine: intake triage, evidence requests, decision proposals with explanation, vendor routing, reimbursement automation
OEM warranty parsing system: turn messy manufacturer policies into machine-readable coverage logic
Internal ops copilots: tooling that reduces manual work and increases consistency across customer support, compliance, and finance
Requirements (must have)
(We are hiring at several levels for this role; the required years of experience and expected skill level vary by level.)
1+ years building and shipping ML/LLM systems in production (or equivalent founder-level experience)
Proven experience building agentic products/companies: multi-step workflows, tool use, orchestration, reliability engineering
Deep hands-on expertise in:
RAG and retrieval systems (vector databases, reranking, grounding strategies)
LLM evals (golden sets, automated judging, human eval, regression pipelines)
Prompting and structured outputs (schemas, function/tool calling, robustness)
Model training/fine-tuning fundamentals and tradeoffs (when to tune vs prompt vs retrieve)
Strong software engineering: clean APIs, testing, observability, performance tuning, secure-by-default design
Comfortable owning ambiguous problems end-to-end and driving them to measurable outcomes
Strong preference (nice to have)
Experience building systems with compliance/audit requirements (fintech/insurance/health/enterprise)
Experience with document AI at scale (PDFs, images, messy inputs), and extracting structured truth reliably
Experience designing human-in-the-loop workflows and escalation rules for high-stakes decisions
Experience with infra for LLMs: model hosting, batching, streaming, caching, prompt/version management
Startup or ex-founder background, especially shipping 0→1 products fast
What success looks like (first 90 days)
You ship an agentic workflow that replaces meaningful manual ops work and improves a measurable metric (cycle time, accuracy, cost per claim, attach rate, CSAT)
You implement an eval harness that catches regressions before production and gives us a reliable “quality score” per workflow
You establish a scalable architecture pattern for agents (permissions, audit logs, observability, fallbacks) that the team can replicate
Tech environment
We’re cloud-native and move fast. Expect Python for ML/agents, TypeScript for product surfaces, Postgres for systems of record, event-driven services, AWS and Azure for infrastructure, and a modern LLM + retrieval stack with strong observability and CI/CD.
Why this role is special
Build an AI-native, category-defining company in a massive market
Direct founder exposure and high leverage: your work will change the trajectory of the company
Real breadth: growth + underwriting/claims ops + product, in one seat
Career accelerant: if you perform, your scope and title will grow quickly
How to Apply
Please ensure your profile is up to date and includes a link to your LinkedIn.
In your application message, share 3 things you’ve built or delivered and the results you achieved, using one simple sentence per example (3 sentences total).
About the interview
Interview process
Intro call (15 minutes)
2 remote interviews (45 minutes)
2 in-person interviews (1 hour)
References