Technique Reference
Six techniques.
When to use each.
Each prompting technique targets a different failure mode. This page covers every major technique with a plain-English explanation, a copy-ready prompt template, and the known cases where it fails.
01 Zero-Shot Prompting
Zero-shot prompting gives the model a task and nothing else. No examples, no reasoning chain — just a direct instruction. It works for any task a capable model already understands from pre-training.
When to use it
Start here. Use zero-shot for straightforward tasks: translation, summarisation, classification, extraction, and simple Q&A. Only add complexity if the output fails.
Template
Classify the sentiment of the following customer review as Positive, Negative, or Neutral.
Review: "The product arrived on time but the packaging was damaged."
Sentiment:
Known failure modes
- Output format inconsistent across runs — use few-shot or structured output instead
- Multi-step reasoning degrades — use chain-of-thought instead
- Domain-specific vocabulary errors — use role prompting instead
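The first failure mode above can be caught programmatically. Here is a minimal sketch of a label normalizer (`ALLOWED_LABELS` and `normalize_label` are illustrative helpers, not a library API) that coerces a raw model reply into one of the expected labels or rejects it:

```python
# Normalize a raw zero-shot classification reply into a known label.
# ALLOWED_LABELS and normalize_label are illustrative, not a library API.
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def normalize_label(raw: str) -> str:
    """Strip punctuation/whitespace and match against the allowed labels."""
    cleaned = raw.strip().strip(".!").lower()
    # Models sometimes reply "Sentiment: Neutral" — keep only the last word.
    cleaned = cleaned.split()[-1] if cleaned else cleaned
    if cleaned not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected label: {raw!r}")
    return cleaned.capitalize()
```

Rejecting unexpected labels loudly (rather than guessing) makes the inconsistency visible, which is usually the signal to switch to few-shot or structured output.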
02 Few-Shot Prompting
Few-shot prompting provides 2–8 input/output examples before the real task. The model learns the required format and tone from the examples rather than from an explicit description.
When to use it
Use few-shot when output format must be consistent — structured labels, specific JSON keys, a particular writing style, or domain-specific terminology the model doesn't naturally produce.
Template
Extract the company name, role, and start year from each bio.
Bio: "Jane Doe joined Acme Corp as a Senior Engineer in 2019."
Output: {"company": "Acme Corp", "role": "Senior Engineer", "year": 2019}
Bio: "Bob Smith became CTO of Horizon Labs in 2021."
Output: {"company": "Horizon Labs", "role": "CTO", "year": 2021}
Bio: "Maria Chen started as a Product Manager at Nova Systems in 2023."
Output:
Selecting good examples
Examples should cover edge cases, not just typical inputs. One example with an ambiguous case is worth more than three obvious ones. Keep examples short — long few-shot prompts consume tokens fast.
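The guidance above can be sketched as a small prompt builder (a hypothetical helper, not any library's API) that keeps the instruction, the example pairs, and the final query in the same shape as the template:

```python
import json

def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, input/output pairs, then the query."""
    parts = [instruction]
    for bio, output in examples:
        # json.dumps keeps the example outputs in exactly the target format.
        parts.append(f'Bio: "{bio}"\nOutput: {json.dumps(output)}')
    parts.append(f'Bio: "{query}"\nOutput:')
    return "\n\n".join(parts)
```

Keeping examples as data rather than hard-coded strings makes it easy to swap in edge-case examples and to measure token cost per example.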
Known failure modes
- The model echoes the example format even when inappropriate — validate outputs
- Poor examples produce worse results than zero-shot
- Token cost scales with number of examples
03 Chain-of-Thought (CoT)
Chain-of-thought prompting asks the model to reason step-by-step before giving an answer. It's one of the best-documented accuracy improvements for multi-step tasks: even appending the bare phrase "Let's think step by step" measurably improves results.
When to use it
Use CoT for arithmetic, logic puzzles, multi-step planning, legal/medical reasoning, and any task where getting the right answer requires intermediate conclusions. It has minimal effect on simple factual lookups.
Zero-shot CoT template
A factory produces 480 units per day. They operate 5 days per week.
How many units are produced in 4 weeks?
Let's think step by step.
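For reference, the arithmetic a correct chain should reproduce for this template:

```python
# Expected reasoning for the factory template:
units_per_day = 480
days_per_week = 5
weeks = 4
units_per_week = units_per_day * days_per_week   # 2400
total = units_per_week * weeks                   # 9600
print(total)
```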
Few-shot CoT template
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger starts with 5 tennis balls.
2 cans × 3 balls = 6 new balls.
5 + 6 = 11 tennis balls.
Answer: 11
Q: A train travels at 90 km/h. How long does it take to cover 315 km?
A:
Wei et al. (2022), "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", introduced CoT prompting with worked few-shot examples; the zero-shot "Let's think step by step" trigger was documented separately by Kojima et al. (2022), "Large Language Models are Zero-Shot Reasoners". The effect is more pronounced on large models (roughly 100B+ parameters).
Known failure modes
- Model produces plausible-looking but wrong reasoning chains — especially for real-world knowledge tasks
- Longer reasoning increases token cost and latency
- Limited benefit on small models (<7B parameters)
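Because CoT responses mix reasoning with the final answer, downstream code usually needs to pull the answer out. A minimal sketch (the `Answer:` marker matches the few-shot template above; `extract_answer` is an illustrative helper):

```python
import re
from typing import Optional

def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent."""
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None
```

Taking the last match matters: reasoning chains sometimes restate intermediate "Answer:" lines before the final one.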
04 Role Prompting
Role prompting assigns the model a persona or domain identity before the task. This shifts the vocabulary, assumed knowledge level, and stylistic choices of the response without changing the underlying task.
When to use it
Use role prompting when you need domain-specific framing: medical, legal, financial, academic, or technical communication. Also useful to constrain tone — "respond as a terse senior engineer" removes unnecessary hedging.
Template
You are a board-certified emergency physician. A patient presents with sudden-onset
chest pain radiating to the left arm, diaphoresis, and shortness of breath.
List the three most critical immediate actions in order of priority.
Known failure modes
- Role can cause the model to hallucinate domain-specific authority it doesn't have — always verify factual claims
- Vague roles ("act as an expert") produce minimal benefit; specific roles ("you are a Python performance engineer at a fintech startup") work better
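In chat-style APIs, the role usually belongs in the system message rather than the user turn, so it persists across the conversation. A sketch of the message structure (the `role`/`content` message shape is common to most chat APIs; `with_persona` is a hypothetical helper, no specific client assumed):

```python
def with_persona(persona: str, task: str) -> list:
    """Build a chat message list that pins the persona in the system turn."""
    return [
        {"role": "system", "content": persona},
        {"role": "user", "content": task},
    ]

messages = with_persona(
    "You are a board-certified emergency physician.",
    "List the three most critical immediate actions in order of priority.",
)
```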
05 Structured Output
Structured output prompting instructs the model to respond in a machine-parseable format — JSON, CSV, XML, or a custom schema. Use it whenever the response feeds into code, a database, or another automated system.
When to use it
Any time the response will be parsed programmatically. Most modern LLM APIs support a response_format or JSON mode that enforces valid syntax — use this over prompt-level instructions when available.
Template
Extract all product mentions from the text below.
Return a JSON array where each item has: name (string), price_usd (number or null),
sentiment (one of: positive, neutral, negative).
Text: "The new AirPods Pro at $249 are excellent, but the $129 Beats Flex feels cheap."
JSON:
Combining with few-shot
For complex schemas, add one complete example before the real input. This eliminates ambiguity about which fields are required and what null values look like.
Known failure modes
- Model may produce syntactically valid but semantically wrong JSON — validate with a schema
- Very large schemas cause the model to omit fields — use API-level JSON mode
- Nested schemas beyond 3 levels deep are unreliable without few-shot examples
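The first two failure modes above argue for validating parsed output, not just parsing it. A minimal sketch (field names match the template above; `validate_products` is an illustrative helper, not a schema library — in production, a library like jsonschema or Pydantic does this more robustly):

```python
import json

REQUIRED = {"name": str, "price_usd": (int, float, type(None)), "sentiment": str}
SENTIMENTS = {"positive", "neutral", "negative"}

def validate_products(raw: str) -> list:
    """Parse the model's JSON and check each item against the expected schema."""
    items = json.loads(raw)
    for item in items:
        for field, types in REQUIRED.items():
            if field not in item or not isinstance(item[field], types):
                raise ValueError(f"Bad field {field!r} in {item}")
        if item["sentiment"] not in SENTIMENTS:
            raise ValueError(f"Bad sentiment in {item}")
    return items
```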
06 Self-Consistency
Self-consistency generates multiple independent answers and selects the most common one. It improves accuracy by trading latency and cost for reliability — particularly useful for high-stakes outputs where a single run may be wrong.
When to use it
Use self-consistency for any classification, extraction, or reasoning task where accuracy matters more than speed. Run 3–7 completions with temperature > 0 and take the majority answer.
Implementation pattern
# Pseudocode — llm.complete and parse_answer stand in for your client and parser
from collections import Counter

answers = []
for i in range(5):
    response = llm.complete(prompt, temperature=0.7)
    answers.append(parse_answer(response))
final_answer = Counter(answers).most_common(1)[0][0]
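To see majority voting in action without an API, here is a runnable version with a stubbed sampler (`fake_complete` is hypothetical: it simulates a model that answers the train question correctly most of the time):

```python
import random
from collections import Counter

def fake_complete(prompt: str) -> str:
    """Stand-in for an LLM call: correct 80% of the time, wrong otherwise."""
    return "3.5 hours" if random.random() < 0.8 else "4 hours"

def self_consistent_answer(prompt: str, n: int = 5) -> str:
    """Sample n answers and return the majority vote."""
    answers = [fake_complete(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

random.seed(0)
print(self_consistent_answer("How long does a train at 90 km/h take to cover 315 km?"))
```

Even when one or two samples are wrong, the majority recovers the correct answer — which is exactly the reliability trade the section describes.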
Known failure modes
- Cost multiplies with N runs — only worth it for tasks with verifiable correctness
- If the majority of runs share a systematic error, self-consistency amplifies the mistake
- Doesn't help with creative tasks where there's no "correct" answer