
Confidence Scoring

An AI that always sounds certain is a liability. Users who can't tell the difference between a confident accurate output and a confident hallucination will eventually trust the AI when they shouldn't — and the consequences scale with the stakes. Confidence scoring is the PM mechanism for giving users the information they need to decide when to act on AI output and when to verify it. This skill designs the scoring system, the UX, and the logic that drives both.

---

Context

What confidence scoring is not:

Confidence scoring is not just exposing the model's raw probability scores. Softmax probabilities are poorly calibrated and meaningless to non-technical users. Confidence scoring is a product design decision: which signal to compute, how to band it into levels, and how to present each level to the user.

The three signals that drive confidence:
  • Retrieval confidence: scores how relevant the retrieved source documents were (RAG features). Best for Q&A, document search, and knowledge bases.
  • Self-assessed uncertainty: prompt the model to flag its own uncertainty. Best for general-purpose AI and free-text generation.
  • Downstream validation: a second model call checks the output for errors. Best for high-stakes factual features.
The product design principle: Show confidence at the output level, not the session level.

---

Step 1 — Define the confidence scoring context

Ask:

  • What does the AI feature output?
  • What is the stakes level?
  • Does the feature use RAG?
  • What action does the user take based on the AI output?
  • What does the user do when they don't trust an output?
  • What is the user's technical sophistication?

---

Step 2 — Choose the confidence scoring approach

Approach A — Retrieval-based confidence (for RAG features)

Score relevance using cosine similarity thresholds: High ≥ 0.85, Medium 0.70–0.84, Low < 0.70.
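To make the thresholds concrete, here is a minimal sketch of the banding logic, assuming the cosine similarities of the retrieved documents have already been computed upstream; the function name and the choice to band on the best-matching document are illustrative, not prescribed by this skill.

```python
def retrieval_confidence(similarities):
    """Band a RAG query's retrieval confidence from its document similarities.

    Applies the thresholds above (High >= 0.85, Medium 0.70-0.84, Low < 0.70)
    to the best-matching retrieved document.
    """
    if not similarities:
        return "LOW"  # nothing retrieved at all
    top = max(similarities)
    if top >= 0.85:
        return "HIGH"
    if top >= 0.70:
        return "MEDIUM"
    return "LOW"
```

For example, `retrieval_confidence([0.91, 0.72])` returns `"HIGH"` because the best match clears the 0.85 bar even though the second document does not.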

Approach B — Self-assessed uncertainty (for general features)

Prompt the model to evaluate its own confidence as HIGH/MEDIUM/LOW with reasoning.
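One way to wire this up is to append a self-assessment instruction to the prompt and parse the marker out of the response. The suffix wording and the parsing helper below are illustrative assumptions; the important design choice is defaulting to LOW when the model fails to emit the marker, so a parsing miss never inflates confidence.

```python
import re

# Illustrative prompt suffix asking the model to self-assess.
SELF_ASSESS_SUFFIX = (
    "\n\nAfter your answer, add a line of the form "
    "'CONFIDENCE: HIGH|MEDIUM|LOW' followed by a one-sentence reason. "
    "Use LOW whenever you are relying on facts you cannot verify."
)

def parse_self_assessment(model_output):
    """Extract the model's self-reported confidence band.

    Defaults to LOW if the marker is missing or malformed, so a parsing
    failure can never make an output look more trustworthy.
    """
    match = re.search(r"CONFIDENCE:\s*(HIGH|MEDIUM|LOW)", model_output,
                      re.IGNORECASE)
    return match.group(1).upper() if match else "LOW"
```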

Approach C — Downstream validation scoring (for Critical/High stakes)

A second model call checks the output for factual errors. This doubles model cost and adds a full round-trip of latency, so reserve it for the highest-stakes outputs.
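The two-pass shape can be sketched as follows. `generate` and `validate` are hypothetical placeholders for the team's model clients, with `validate` assumed to return a list of suspected errors (empty means clean); how issue counts map to bands is a product decision, not fixed by this skill.

```python
def validated_answer(question, generate, validate):
    """Two-pass scoring: a second model call audits the first draft.

    generate(question) -> draft answer string.
    validate(question, draft) -> list of suspected factual errors.
    Returns the draft, a confidence band derived from the issue count,
    and the issues themselves for display or logging.
    """
    draft = generate(question)
    issues = validate(question, draft)
    if not issues:
        confidence = "HIGH"
    elif len(issues) == 1:
        confidence = "MEDIUM"
    else:
        confidence = "LOW"
    return draft, confidence, issues
```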

---

Step 3 — Design the confidence UX

Four patterns:

  • Colour-coded signal
  • Inline disclaimer
  • Source citation display
  • Refusal threshold (suppress the answer entirely below a confidence floor)

UX anti-patterns to avoid: showing raw percentage scores, using "I think" language without a systematic signal behind it, and showing confidence only in a settings page rather than alongside each output.
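As a sketch of how the score might drive the display, the mapping below combines a colour-coded signal with a refusal threshold at LOW. The badge colours and copy strings are placeholder assumptions; the skill requires exact copy to be written per feature.

```python
def render_confidence(level):
    """Map a confidence band to an output-level UX treatment.

    Colours and copy are illustrative placeholders. LOW triggers the
    refusal-threshold pattern instead of showing a shaky answer.
    """
    if level == "HIGH":
        return {"badge": "green",
                "copy": "Sources strongly support this answer."}
    if level == "MEDIUM":
        return {"badge": "amber",
                "copy": "Partially supported - worth verifying key facts."}
    return {"badge": "red",
            "copy": "I couldn't find enough support to answer reliably.",
            "refuse": True}
```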

---

Step 4 — Define the calibration process

Collect 100 outputs, have humans rate each one for accuracy, and build a calibration table per confidence band. A well-calibrated system: HIGH ≥ 90% accurate, MEDIUM 60–89%, LOW < 60%. Recalibrate whenever the model or prompt changes, or when the input distribution shifts.
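Building the calibration table from the human-rated sample can be sketched as below. The function name is an assumption; the targets encode the bands stated above, and the output flags which bands are out of calibration.

```python
from collections import defaultdict

# Calibration targets from the process above.
TARGETS = {
    "HIGH": lambda acc: acc >= 0.90,
    "MEDIUM": lambda acc: 0.60 <= acc < 0.90,
    "LOW": lambda acc: acc < 0.60,
}

def calibration_table(samples):
    """Build a calibration table from human-rated outputs.

    samples: list of (confidence_label, human_rated_accurate: bool) pairs.
    Returns {label: (count, accuracy, within_target)} so a miscalibrated
    band is visible at a glance.
    """
    buckets = defaultdict(list)
    for label, accurate in samples:
        buckets[label].append(accurate)
    table = {}
    for label, results in buckets.items():
        accuracy = sum(results) / len(results)
        table[label] = (len(results), accuracy, TARGETS[label](accuracy))
    return table
```

If HIGH outputs are only 75% accurate, for instance, the band fails its ≥ 90% target and the thresholds (or the prompt) need tightening before launch.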

---

Quality check before delivering

  • Scoring approach matched to feature type
  • Thresholds have numeric values
  • Fallback behaviour for LOW confidence is specified
  • UX pattern includes exact copy
  • Calibration process is defined
  • Known limitations are stated

Suggested next step: Before implementing, run a manual calibration pass. Take 50 recent outputs, score them with your chosen approach, and have a colleague independently rate them for accuracy. Fix the signal before you design the display.