Overview
This is a high-ownership “do whatever it takes” role for someone who wants to operate at founder speed, learn the full stack of an insurance/warranty business, and ship work that directly moves revenue, conversion, and retention.
What you’ll do
You will build the agentic layer of our core product: AI systems that reason, take actions, and reliably complete workflows across pricing/underwriting, policy issuance, claims intake, adjudication, fulfillment (repair/replacement/reimbursement), and other parts of the business.
Key responsibilities
- Design and ship production-grade AI agents that run real business processes (not demos)
- Build agentic architectures: orchestration, tool calling, state machines, memory, permissions, audit trails, human-in-the-loop, and fallback paths
- Own our RAG platform end-to-end: ingestion, chunking, embeddings, retrieval, reranking, citations/grounding, and hallucination mitigation
- Build evaluation and monitoring systems: offline eval sets, regression tests, online metrics, drift detection, and red-team suites
- Implement model optimization: prompt systems, structured outputs, fine-tuning where appropriate, latency/cost optimization, caching, and throughput tuning
- Build core ML systems for warranty/claims: document understanding, extraction, classification, anomaly/fraud signals, decision support, and SLA routing
- Partner tightly with product/ops to translate real workflows into deterministic, testable, compliant automation
What you’ll build (examples)
- Underwriting/pricing agents: real-time quote decisions using merchant/product/context signals with strict guardrails and auditability
- Claims copilot + auto-adjudication engine: intake triage, evidence requests, decision proposals with explanation, vendor routing, reimbursement automation
- OEM warranty parsing system: turn messy manufacturer policies into machine-readable coverage logic
- Internal ops copilots: tooling that reduces manual work and increases consistency across customer support, compliance, and finance
Requirements (must have)
(We are hiring at several levels for this role; the required years of experience and expected skill level vary by level.)
- 1+ years building and shipping ML/LLM systems in production (or equivalent founder-level experience)
- Proven experience building agentic products/companies: multi-step workflows, tool use, orchestration, reliability engineering
- Deep hands-on expertise in:
  - RAG and retrieval systems (vector databases, reranking, grounding strategies)
  - LLM evals (golden sets, automated judging, human eval, regression pipelines)
  - Prompting and structured outputs (schemas, function/tool calling, robustness)
  - Model training/fine-tuning fundamentals and tradeoffs (when to tune vs prompt vs retrieve)
- Strong software engineering: clean APIs, testing, observability, performance tuning, secure-by-default design
- Comfortable owning ambiguous problems end-to-end and driving them to measurable outcomes
Strong preference (nice to have)
- Experience building systems with compliance/audit requirements (fintech/insurance/health/enterprise)
- Experience with document AI at scale (PDFs, images, messy inputs), and extracting structured truth reliably
- Experience designing human-in-the-loop workflows and escalation rules for high-stakes decisions
- Experience with infra for LLMs: model hosting, batching, streaming, caching, prompt/version management
- Startup or ex-founder background, especially shipping 0→1 products fast
What success looks like (first 90 days)
- You ship an agentic workflow that replaces meaningful manual ops work and improves a measurable metric (cycle time, accuracy, cost per claim, attach rate, CSAT)
- You implement an eval harness that catches regressions before production and gives us a reliable “quality score” per workflow
- You establish a scalable architecture pattern for agents (permissions, audit logs, observability, fallbacks) that the team can replicate
Tech environment
We’re cloud-native and move fast. Expect Python for ML/agents, TypeScript for product surfaces, Postgres for systems of record, event-driven services, and a modern LLM + retrieval stack with strong observability and CI/CD. Infrastructure runs on AWS and Azure.
Why this role is special
- Build an AI-native, category-defining company in a massive market
- Direct founder exposure and high leverage: your work will change the trajectory of the company
- Real breadth: growth + underwriting/claims ops + product, in one seat
- Career accelerant: if you perform, your scope and title will grow quickly
How to apply
- Please ensure your profile is up to date and includes a link to your LinkedIn.
- In your application message, share 3 things you’ve built or delivered, with the results you achieved, in one simple sentence per example (3 sentences total).