Production e-commerce search with proven relevance

Fine-tuned ML pipeline that delivers 28% better results than vanilla TypeSense/OpenSearch (NDCG@10 measured against the BM25 baseline below). Now deployed and ready.

Benchmarked on 130K real Amazon queries (ESCI dataset):

Kwiree (fine-tuned): 0.332 NDCG@10
OpenSearch Semantic + Rerank: 0.319 NDCG@10
OpenSearch BM25 + Rerank: 0.307 NDCG@10
OpenSearch Hybrid: 0.302 NDCG@10
OpenSearch Semantic: 0.296 NDCG@10
OpenSearch BM25: 0.258 NDCG@10
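
For readers who want to sanity-check the metric, here is a minimal sketch of NDCG@10 with graded relevance, plus the relative-lift arithmetic behind the headline number. The gain assigned to each ESCI label (Exact, Substitute, Complement, Irrelevant) is an assumption; the page does not say which mapping the benchmark used, and the function names are illustrative only.

```python
import math

# Assumed gain per ESCI label; the benchmark's actual mapping is not stated here.
GAINS = {"E": 1.0, "S": 0.1, "C": 0.01, "I": 0.0}


def dcg(gains, k=10):
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_labels, all_labels, k=10):
    """NDCG@k: DCG of the returned ranking divided by DCG of the ideal ranking."""
    gains = [GAINS[label] for label in ranked_labels]
    ideal = sorted((GAINS[label] for label in all_labels), reverse=True)
    ideal_dcg = dcg(ideal, k)
    return dcg(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def relative_lift(score, baseline):
    """Relative improvement of score over baseline, as a fraction."""
    return (score - baseline) / baseline


print(round(relative_lift(0.332, 0.258) * 100, 1))  # lift over OpenSearch BM25
print(round(relative_lift(0.332, 0.319) * 100, 1))  # lift over Semantic + Rerank
```

With the scores listed above, the lift over plain OpenSearch BM25 works out to about 28.7%, and the lift over the strongest OpenSearch configuration (Semantic + Rerank) to about 4.1%.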

Production Ready, Not Vapor

This is not a design partner program. The API is live.

What's ready:

  • Documented REST API (ingest, search, analytics)
  • Deployed on production infrastructure
  • Monitoring and alerting configured
  • 99% uptime SLA with prorated credits
  • P99 latency <150ms on production traffic
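
To make the 99% figure concrete, the sketch below converts an uptime percentage into allowed downtime per month. The credit rule in `prorated_credit` is purely hypothetical (refund the share of the fee that corresponds to downtime beyond the allowance); the actual SLA terms are not described on this page.

```python
def allowed_downtime_hours(uptime_pct, days=30):
    """Hours of downtime a given uptime percentage permits over `days` days."""
    return days * 24 * (1 - uptime_pct / 100)


def prorated_credit(monthly_fee, downtime_hours, uptime_pct=99.0, days=30):
    """Hypothetical credit rule, NOT the published SLA terms:
    refund the fee share for downtime in excess of the SLA allowance."""
    total_hours = days * 24
    excess = max(0.0, downtime_hours - allowed_downtime_hours(uptime_pct, days))
    return monthly_fee * excess / total_hours


print(round(allowed_downtime_hours(99.0), 1))  # hours per 30-day month
print(round(prorated_credit(949, downtime_hours=12), 2))
```

A 99% SLA over a 30-day month allows roughly 7.2 hours of downtime before any credit would apply.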

What's NOT ready yet:

  • Synonym management
  • A/B testing framework (use your own tools)
  • Advanced merchandising rules
  • 24/7 support (48hr email response, monitoring is 24/7)

Why Not Build This Yourself?

"Can't I just hire a contractor to build hybrid search?"

You could. But here's what that actually looks like:

Build it yourself:

Month 1-2: Select and fine-tune embedding model
Month 3-4: Select and fine-tune reranking model
Month 5-6: Integrate, debug, optimize
Ongoing: Maintain model, retrain, host, monitor

Cost: $15K-30K upfront + $3K-5K/month maintenance

Risk: You need in-house ML expertise to maintain it. If the contractor leaves, you're stuck. The model degrades over time without retraining.

Use Kwiree:

Week 1: Integrate API, test on staging
Week 2: Production rollout, monitor results
Ongoing: I maintain models, infrastructure, ML

Cost: $949/month (early adopter, 12 months)

Risk: Solo founder, 99% SLA, early product.
But: proven tech, no build cost, exit anytime.

The question: Is saving roughly $40-80K in the first year worth the early-stage risk?

For most teams with limited ML expertise, a managed service makes more sense.
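
The savings figure can be reproduced from the numbers quoted above. This is a back-of-the-envelope sketch: it assumes a 12-month horizon and takes the low and high ends of each range at face value; the variable names are illustrative.

```python
MONTHS = 12

# Ranges quoted above, as (low, high) in USD.
build_upfront = (15_000, 30_000)
build_monthly = (3_000, 5_000)
kwiree_monthly = 949  # early-adopter price, valid for 12 months

# First-year cost of building: upfront work plus a year of maintenance.
build_first_year = tuple(
    upfront + monthly * MONTHS
    for upfront, monthly in zip(build_upfront, build_monthly)
)
kwiree_first_year = kwiree_monthly * MONTHS

# First-year savings at the low and high ends of the build estimate.
savings = tuple(cost - kwiree_first_year for cost in build_first_year)

print(build_first_year)   # (51000, 90000)
print(kwiree_first_year)  # 11388
print(savings)            # (39612, 78612)
```

The $40-80K range therefore describes the first year, when the upfront build cost is included. In later years only the maintenance cost is avoided, and the comparison depends on pricing after the 12-month early-adopter period, which is not stated here.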

Technical Architecture

Hybrid pipeline: BM25 + fine-tuned embeddings + fine-tuned reranking, optimized for e-commerce product search.

Query: "wireless headphones for running"

  1. BM25 Retrieval (100 candidates): Lexical matching on titles/descriptions. p50 latency: 14ms
  2. Embedding Search (100 candidates): Fine-tuned on e-commerce queries. p50 latency: 30ms
  3. Fusion (50 candidates): Reciprocal rank fusion. p50 latency: 14ms
  4. Reranking (Final 10): Fine-tuned on e-commerce queries. p50 latency: 40ms
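
The fusion step can be illustrated with a minimal sketch of reciprocal rank fusion (RRF). This is the textbook formulation, not necessarily Kwiree's implementation: the smoothing constant `k=60` is the conventional default from the RRF literature, and the product IDs are made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=50):
    """Merge several ranked lists of document IDs into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    fused = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return fused[:top_n]


# Hypothetical candidate lists from the two retrieval stages.
bm25_hits = ["p3", "p1", "p7", "p2"]
embedding_hits = ["p1", "p9", "p3", "p4"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits], top_n=3)
print(fused)  # ['p1', 'p3', 'p9']
```

Documents that rank well in both lists (here `p1` and `p3`) rise to the top, which is why RRF needs no score normalization between lexical and embedding retrieval. In the pipeline described above, the inputs would be the two 100-candidate lists and `top_n` would be 50, with the result passed on to the reranker.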

Total latency: ~100ms p50 (120-150ms p99)
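
As a rough consistency check on the latency budget, the per-stage p50 figures can be summed. Two caveats: per-stage medians do not add up exactly to the end-to-end median, and the page does not say whether the two retrieval stages run sequentially or in parallel, so both cases are shown as assumptions.

```python
# Per-stage p50 latencies quoted above, in milliseconds.
stage_p50_ms = {"bm25": 14, "embedding": 30, "fusion": 14, "rerank": 40}

# Case 1 (assumption): every stage runs one after another.
sequential_ms = sum(stage_p50_ms.values())

# Case 2 (assumption): BM25 and embedding retrieval run concurrently,
# so retrieval costs only as much as the slower of the two.
parallel_ms = (
    max(stage_p50_ms["bm25"], stage_p50_ms["embedding"])
    + stage_p50_ms["fusion"]
    + stage_p50_ms["rerank"]
)

print(sequential_ms)  # 98
print(parallel_ms)    # 84
```

The sequential sum of 98ms lines up with the quoted total of about 100ms once a little network and serialization overhead is added.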

Addressing the Elephant in the Room

Solo Founder Risk: Let's talk about it.

The concerns are valid:

  • Single point of failure (what if I get hit by a bus?)
  • Limited bandwidth (support response times)
  • No redundancy (if I'm sick, things slow down)
  • Uncertain future (will this become full-time?)

Here's how I'm mitigating these risks:

1. Technical safeguards

  • Infrastructure is automated
  • Monitoring runs 24/7 (I get paged)
  • Data backups are automated and tested
  • Code is documented for a potential future team

2. Exit guarantees

  • Data export API available anytime
  • First 14 days are free: evaluate with no strings attached
  • No long-term contracts (month-to-month)
  • If I shut down: 90-day notice + model handoff
  • Fine-tuned models would be open-sourced if I quit

3. Commitment path

  • Currently: Part-time (employed elsewhere)
  • At 10 customers: Full-time commitment
  • At 25 customers: Hire support engineer
  • Being transparent about growth path

4. Track record

  • Built production ML search systems at 2 companies
  • Know how to run infrastructure at scale
  • Have deployed customer-facing APIs before
  • Not my first production system

The honest assessment:

  • If you need enterprise stability → use Algolia
  • If you're okay with early-stage risk for up to 70% savings and better relevance → this might work
  • If you're risk-averse → wait 6 months, come back

I'm not hiding the risks. You're betting on:

  1. The technical approach (proven in benchmarks)
  2. My ability to operate infrastructure (proven at previous jobs)
  3. Early-stage execution risk (real, but mitigated)

Ready to validate the claims?

Run the benchmarks yourself, or apply for a free benchmark on your catalog.