How to ship a production AI chatbot in 14 days


Fourteen days from zero to a live AI chatbot your company can actually use. The schedule we follow on every client project, down to what happens on each day.

By Mező Dezső, Founder, DField Solutions

The AI chatbot pitch is easy: drop it on your site, save the support team 20 hours a week. The reality is that most teams spend three months on this and never ship — not because it's hard, but because nobody ever wrote down the days. Here's our default 14-day schedule, every step calibrated on the last ten projects we did.

None of this is magic. Each day does one concrete thing, and if you can't finish today's, you don't move on to tomorrow's.

Days 1–2: pick one measurable job

Don't say 'we want AI.' Say 'we want ticket volume down from 200/week to 50.' If you can't put a number on it, spend two more weeks measuring before you start. Every other decision — which data, which model, how to evaluate — flows from that one number.

Days 3–5: one folder for the knowledge

The quality of an AI chatbot is the quality of its data, not the quality of its model. Pull your FAQ, the last three months of support email threads, product specs, and pricing into one folder — Drive, Notion, doesn't matter. This is the actual moat. Spend two full days on this even if it feels boring.
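Once the folder exists, it has to become chunks an index can search. A minimal ingestion sketch, assuming Markdown exports and illustrative chunk sizes (800 characters with 100 of overlap — tune these for your own docs):

```python
# Walk one knowledge folder and split each doc into overlapping chunks.
# File pattern, chunk size, and overlap are assumptions, not fixed rules.
from pathlib import Path

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap so no fact is cut in half."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def ingest_folder(folder: str) -> list[dict]:
    """Return every chunk tagged with its source file, ready for indexing."""
    corpus = []
    for path in Path(folder).rglob("*.md"):
        for chunk in chunk_text(path.read_text(encoding="utf-8")):
            corpus.append({"source": path.name, "text": chunk})
    return corpus
```

Keeping the source filename on every chunk is what makes the citations in the next step possible.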

Days 6–8: retrieval, not model

We build a hybrid retriever: BM25 keyword + vector + reranker. The chatbot may only answer from the retrieved chunks, always with source citations. If the retriever finds nothing, the bot refuses to answer. This is the part that lives through the next six model swaps.

# Simplified retrieval flow (HybridRetriever is our internal wrapper)
from dfield.retrieval import HybridRetriever

retriever = HybridRetriever(
    bm25_weight=0.4,          # keyword (BM25) share of the fused score
    vector_weight=0.6,        # embedding share of the fused score
    reranker="bge-reranker-v2-m3",
    refuse_below_score=0.55,  # below this score the bot refuses instead of guessing
)

def answer(query: str) -> str:
    chunks = retriever.search(query, top_k=8)
    if not chunks:
        return "I don't have an answer for that in the company docs."
    # Generate only from these chunks, citing each source.

Days 9–10: the eval set is the product

Pull 50 real historical questions from the actual support archive. For each, write down in one sentence what a good answer looks like. Run the bot on all 50 and score each answer: pass / fail / almost. Gate the release on a >85% pass rate. This is the step most teams skip, which is also why their bots silently rot.

If pass rate is below 85%, the fix is almost always in the knowledge folder, not the model. Hallucination usually means 'the right document wasn't in the retrieval index.'
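The whole harness fits in a few lines. A sketch with assumed pieces: `chatbot` is your bot as a callable, and the keyword-match `grade` is a placeholder you'd replace with human review or an LLM judge:

```python
# Minimal eval harness: run every case, score it, gate the release.
# grade() is a deliberately crude placeholder for a real grader.
PASS_GATE = 0.85

def grade(answer: str, rubric: str) -> str:
    """Pass if every ';'-separated rubric term appears in the answer."""
    terms = [t.strip().lower() for t in rubric.split(";")]
    return "pass" if all(t in answer.lower() for t in terms) else "fail"

def run_eval(cases: list[dict], chatbot) -> tuple[float, bool]:
    """Return (pass rate, whether the release gate is cleared)."""
    results = [grade(chatbot(c["question"]), c["rubric"]) for c in cases]
    pass_rate = results.count("pass") / len(results)
    return pass_rate, pass_rate >= PASS_GATE
```

Run it on every change to the knowledge folder, not just on model swaps — that's where most regressions come from.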

Days 11–12: guardrails + cost routing

PII scrubber on input and output, prompt-injection detector, output schema validation. And cost routing: a small model handles easy questions (90% of traffic), a big model only the hard ones. In practice this cuts LLM spend by 3–5× with no quality impact.
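The router itself can be simple. A sketch under stated assumptions: the model names are placeholders, and the easy/hard heuristic (short question plus a strong retrieval score) is one illustrative choice, not the only one:

```python
# Cost routing sketch: cheap model for easy questions, big model for the rest.
# Thresholds and model names are assumptions to tune against your own eval set.
def is_easy(question: str, retrieval_score: float) -> bool:
    """Short question with a strong retrieval hit -> the small model will do."""
    return len(question.split()) < 30 and retrieval_score >= 0.75

def route(question: str, retrieval_score: float) -> str:
    """Return which model tier should answer this question."""
    return "small-model" if is_easy(question, retrieval_score) else "large-model"
```

Because the eval set from days 9–10 already exists, you can verify the claim directly: route each eval question, and confirm the pass rate holds.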

Days 13–14: deploy + real-time dashboard

Deploy to the widget on your site and to Slack for the support team. Ship with a dashboard from day one: questions per day, pass rate (graded on a 5% sample of live traffic), cost per active user, and the top-5 questions the bot refused. The dashboard is the thing that makes week 3 through week 20 better than week 2.
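Every dashboard metric above can be derived from one append-only event log. A minimal sketch — the field names and the JSONL file are assumptions; in production this would go to your analytics store:

```python
# One event per bot interaction; the dashboard aggregates this log.
# Field names and the 5% sampling rate are illustrative assumptions.
import json
import random
import time

def log_interaction(question: str, answer: str, cost_eur: float,
                    refused: bool, path: str = "events.jsonl") -> None:
    """Append one interaction event; a 5% random sample is flagged for grading."""
    event = {
        "ts": time.time(),
        "question": question,
        "answer": answer,
        "cost_eur": cost_eur,
        "refused": refused,
        "sampled_for_eval": random.random() < 0.05,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
```

Questions per day, cost per user, refusal top-5, and the live pass-rate sample all fall out of grouping this log — no extra instrumentation needed.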

Where people lose three months

  • Fine-tuning before proving retrieval works — almost always a waste of money.
  • No eval set, so quality is 'vibes' and nobody dares to ship.
  • Only one model, locked in when prices and quality change monthly.
  • No source citations, so the support team can't verify answers quickly.
  • Ship without a dashboard, blind to what's actually going wrong.

What it actually costs

The 14 days cost €10–18k depending on scope. Monthly run costs €200–800 for most businesses (LLM usage, hosting, monitoring). If your support team spends even 15 hours a week answering repeat questions, the bot pays for itself in three months.

Want to see whether a 14-day ship is realistic for your setup? A 30-minute call is free. Tell us the measurable job, we'll tell you what's possible inside two weeks and what isn't.

Mező Dezső, Founder, DField Solutions. I've shipped production products from fintech to creator-tooling, for startups and enterprises, from Budapest to San Francisco.
