Prompting Strategy Selector
Choosing the wrong prompting strategy wastes engineering time and produces worse outputs. The right choice depends on what you're asking the model to do, what data you have, and what quality and cost constraints you're working within.
---
Context
The four main strategies:

| Strategy | What it is | When it works |
|---|---|---|
| Zero-shot | Give the model a task with no examples | Simple, well-defined tasks within the model's training |
| Few-shot | Give the model 2–5 examples of the desired input/output | Tasks where format, tone, or domain-specific output matters |
| Chain-of-thought (CoT) | Instruct the model to reason step by step | Complex reasoning, multi-step decisions |
| RAG | Retrieve relevant documents and include in context | Tasks requiring up-to-date, specific, or proprietary knowledge |
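Few-shot prompting, for example, comes down to concatenating example input/output pairs ahead of the real task. A minimal sketch, using hypothetical ticket-classification examples as placeholders:

```python
# Minimal sketch of few-shot prompt assembly. The task and example
# pairs are hypothetical placeholders -- swap in your own.
EXAMPLES = [
    ("Refund request for order #1234", "billing"),
    ("App crashes on login", "bug"),
    ("How do I export my data?", "how-to"),
]

def build_few_shot_prompt(task_input: str) -> str:
    lines = ["Classify each support ticket into a category.", ""]
    for ticket, category in EXAMPLES:
        lines += [f"Ticket: {ticket}", f"Category: {category}", ""]
    # The real input goes last, after all the examples.
    lines += [f"Ticket: {task_input}", "Category:"]
    return "\n".join(lines)

print(build_few_shot_prompt("Cannot reset my password"))
```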
---
Step 1 — Understand the task requirements
Assess: task type, data needed, output consistency requirements, volume, cost tolerance, and availability of labelled examples.
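One way to make the assessment concrete is to capture it in a small record. The field names below are an assumed structure for illustration, not a required schema:

```python
from dataclasses import dataclass

@dataclass
class TaskAssessment:
    task_type: str             # e.g. "classification", "summarisation"
    needs_external_data: bool  # knowledge outside the model's training?
    needs_reasoning: bool      # multi-step decisions involved?
    consistency_critical: bool # must outputs be uniform across runs?
    daily_volume: int          # drives cost and latency tolerance
    labelled_examples: int     # available input/output pairs
```

Filling this in before running the decision tree forces the cost and data questions that generic recommendations skip.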
Step 2 — Run the decision tree
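The tree can be sketched as a short function. The branch order below is one reasonable reading of the strategy table above, not the only one:

```python
def choose_strategy(needs_external_knowledge: bool,
                    needs_multistep_reasoning: bool,
                    format_sensitive: bool,
                    labelled_examples: int) -> str:
    """One reading of the decision tree; adjust branch order to taste."""
    if needs_external_knowledge:
        return "rag"                   # proprietary or fresh knowledge
    if needs_multistep_reasoning:
        return "chain-of-thought"      # complex, multi-step decisions
    if format_sensitive:
        if labelled_examples >= 100:
            return "few-shot (fine-tuning worth considering)"
        if labelled_examples >= 2:
            return "few-shot"
        return "zero-shot (write 2-5 examples, then retry few-shot)"
    return "zero-shot"

print(choose_strategy(False, False, True, 3))  # few-shot
```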
Step 3 — Deep-dive on each strategy
Zero-shot: Best for simple tasks. Failure signal: inconsistent quality → move to few-shot.

Few-shot: 3–5 diverse examples as input/output pairs. Place examples BEFORE the task. Failure signal: model follows some examples but not others → add diversity or consider fine-tuning.

Chain-of-thought: Add "Think through this step by step." Separate the reasoning from the final output. Don't use it for simple classification/extraction tasks (adds latency). Failure signal: correct reasoning but wrong conclusion → add checkpoints.

RAG: Retrieve 3–5 chunks per query with a similarity threshold. Define a fallback for when no relevant chunks are found. Failure signal: model ignores the retrieved context → strengthen the grounding instruction.

Step 4 — Compare strategies for the specific task
Run a comparison across: quality, consistency, setup time, ongoing cost, latency, knowledge freshness, hallucination risk.
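A lightweight way to run that comparison is a weighted scorecard. The scores below are illustrative placeholders only; replace them with your own ratings for the task at hand:

```python
CRITERIA = ["quality", "consistency", "setup_time", "ongoing_cost",
            "latency", "freshness", "hallucination_risk"]

# 1-5, higher is better on every axis (so low latency scores high).
# Illustrative numbers only -- rate your own task before trusting them.
SCORES = {
    "zero-shot":        [3, 2, 5, 5, 5, 2, 2],
    "few-shot":         [4, 4, 4, 4, 4, 2, 3],
    "chain-of-thought": [4, 4, 4, 3, 2, 2, 3],
    "rag":              [4, 4, 2, 3, 3, 5, 4],
}

def rank(weights: list[float]) -> list[str]:
    totals = {name: sum(w * s for w, s in zip(weights, scores))
              for name, scores in SCORES.items()}
    return sorted(totals, key=totals.get, reverse=True)

# Weight knowledge freshness and hallucination risk heavily:
print(rank([1, 1, 1, 1, 1, 3, 3]))
```

Changing the weights to match the task's constraints is the point of the exercise; the ranking should shift when the constraints do.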
Quality check before delivering
- Decision tree was followed — not a generic recommendation
- Comparison table is filled in for the specific task
- Recommendation includes a starter prompt to test
- "When to reconsider" criteria are specific
- Fine-tuning is only recommended if 100+ labelled examples exist
Suggested next step: Build and test the starter prompt with 5–10 real inputs before briefing engineering. The strategy is a hypothesis until it's tested.
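That test pass can be as simple as a loop. `call_model` below stands in for whatever client function you actually use; it is an assumed placeholder, not a real API:

```python
def trial_run(starter_prompt: str, inputs: list[str], call_model) -> list[tuple[str, str]]:
    """Run the starter prompt over a few real inputs and collect the
    outputs for manual review. `call_model` is a placeholder for your
    model client (prompt in, completion out)."""
    return [(text, call_model(starter_prompt.format(input=text)))
            for text in inputs]

# Usage with a stub model, just to show the shape of the loop:
fake_model = lambda prompt: "stub output"
results = trial_run("Summarise: {input}", ["ticket one", "ticket two"], fake_model)
for original, output in results:
    print(original, "->", output)
```

Reviewing 5–10 real outputs by hand is usually enough to confirm or reject the strategy hypothesis before any engineering effort is committed.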