Technique Reference
Six techniques.
When to use each.
Each prompting technique targets a different failure mode. This page covers every major technique with a plain-English explanation, a copy-ready prompt template, and the known cases where it fails.
01 Zero-Shot Prompting
Zero-shot prompting gives the model a task and nothing else. No examples, no reasoning chain — just a direct instruction. It works for any task a capable model already understands from pre-training.
When to use it
Start here. Use zero-shot for straightforward tasks: translation, summarisation, classification, extraction, and simple Q&A. Only add complexity if the output fails.
Template
Classify the sentiment of the following customer review as Positive, Negative, or Neutral.
Review: "The product arrived on time but the packaging was damaged."
Sentiment:
Known failure modes
- Output format inconsistent across runs — use few-shot or structured output instead
- Multi-step reasoning degrades — use chain-of-thought instead
- Domain-specific vocabulary errors — use role prompting instead
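The first failure mode above can be caught programmatically. Here is a minimal sketch of a label normalizer (`ALLOWED_LABELS` and `normalize_label` are illustrative helpers, not a library API) that coerces a raw model reply into one of the expected labels or rejects it:

```python
# Normalize a raw zero-shot classification reply into a known label.
# ALLOWED_LABELS and normalize_label are illustrative, not a library API.
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def normalize_label(raw: str) -> str:
    """Strip punctuation/whitespace and match against the allowed labels."""
    cleaned = raw.strip().strip(".!").lower()
    # Models sometimes reply "Sentiment: Neutral" — keep only the last word.
    cleaned = cleaned.split()[-1] if cleaned else cleaned
    if cleaned not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected label: {raw!r}")
    return cleaned.capitalize()
```

Rejecting unexpected labels loudly (rather than guessing) makes the inconsistency visible, which is usually the signal to switch to few-shot or structured output.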
02 Few-Shot Prompting
Few-shot prompting provides 2–8 input/output examples before the real task. The model learns the required format and tone from the examples rather than from an explicit description.
When to use it
Use few-shot when output format must be consistent — structured labels, specific JSON keys, a particular writing style, or domain-specific terminology the model doesn't naturally produce.
Template
Extract the company name, role, and start year from each bio.
Bio: "Jane Doe joined Acme Corp as a Senior Engineer in 2019."
Output: {"company": "Acme Corp", "role": "Senior Engineer", "year": 2019}
Bio: "Bob Smith became CTO of Horizon Labs in 2021."
Output: {"company": "Horizon Labs", "role": "CTO", "year": 2021}
Bio: "Maria Chen started as a Product Manager at Nova Systems in 2023."
Output:
Selecting good examples
Examples should cover edge cases, not just typical inputs. One example with an ambiguous case is worth more than three obvious ones. Keep examples short — long few-shot prompts consume tokens fast.
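The guidance above can be sketched as a small prompt builder (a hypothetical helper, not any library's API) that keeps the instruction, the example pairs, and the final query in the same shape as the template:

```python
import json

def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, input/output pairs, then the query."""
    parts = [instruction]
    for bio, output in examples:
        # json.dumps keeps the example outputs in exactly the target format.
        parts.append(f'Bio: "{bio}"\nOutput: {json.dumps(output)}')
    parts.append(f'Bio: "{query}"\nOutput:')
    return "\n\n".join(parts)
```

Keeping examples as data rather than hard-coded strings makes it easy to swap in edge-case examples and to measure token cost per example.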
Known failure modes
- The model echoes the example format even when inappropriate — validate outputs
- Poor examples produce worse results than zero-shot
- Token cost scales with number of examples
03 Chain-of-Thought (CoT)
Chain-of-thought prompting asks the model to reason step-by-step before giving an answer. It's one of the best-documented accuracy improvements for multi-step tasks: even appending the bare phrase "Let's think step by step" measurably improves results.
When to use it
Use CoT for arithmetic, logic puzzles, multi-step planning, legal/medical reasoning, and any task where getting the right answer requires intermediate conclusions. It has minimal effect on simple factual lookups.
Zero-shot CoT template
A factory produces 480 units per day. They operate 5 days per week.
How many units are produced in 4 weeks?
Let's think step by step.
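For reference, the arithmetic a correct chain should reproduce for this template:

```python
# Expected reasoning for the factory template:
units_per_day = 480
days_per_week = 5
weeks = 4
units_per_week = units_per_day * days_per_week   # 2400
total = units_per_week * weeks                   # 9600
print(total)
```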
Few-shot CoT template
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger starts with 5 tennis balls.
2 cans × 3 balls = 6 new balls.
5 + 6 = 11 tennis balls.
Answer: 11
Q: A train travels at 90 km/h. How long does it take to cover 315 km?
A:
Wei et al. (2022), "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", introduced CoT prompting with worked few-shot examples; the zero-shot "Let's think step by step" trigger was documented separately by Kojima et al. (2022), "Large Language Models are Zero-Shot Reasoners". The effect is more pronounced on large models (roughly 100B+ parameters).
Known failure modes
- Model produces plausible-looking but wrong reasoning chains — especially for real-world knowledge tasks
- Longer reasoning increases token cost and latency
- Limited benefit on small models (<7B parameters)
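Because CoT responses mix reasoning with the final answer, downstream code usually needs to pull the answer out. A minimal sketch (the `Answer:` marker matches the few-shot template above; `extract_answer` is an illustrative helper):

```python
import re
from typing import Optional

def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent."""
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None
```

Taking the last match matters: reasoning chains sometimes restate intermediate "Answer:" lines before the final one.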
04 Role Prompting
Role prompting assigns the model a persona or domain identity before the task. This shifts the vocabulary, assumed knowledge level, and stylistic choices of the response without changing the underlying task.
When to use it
Use role prompting when you need domain-specific framing: medical, legal, financial, academic, or technical communication. Also useful to constrain tone — "respond as a terse senior engineer" removes unnecessary hedging.
Template
You are a board-certified emergency physician. A patient presents with sudden-onset
chest pain radiating to the left arm, diaphoresis, and shortness of breath.
List the three most critical immediate actions in order of priority.
Known failure modes
- Role can cause the model to hallucinate domain-specific authority it doesn't have — always verify factual claims
- Vague roles ("act as an expert") produce minimal benefit; specific roles ("you are a Python performance engineer at a fintech startup") work better
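In chat-style APIs, the role usually belongs in the system message rather than the user turn, so it persists across the conversation. A sketch of the message structure (the `role`/`content` message shape is common to most chat APIs; `with_persona` is a hypothetical helper, no specific client assumed):

```python
def with_persona(persona: str, task: str) -> list:
    """Build a chat message list that pins the persona in the system turn."""
    return [
        {"role": "system", "content": persona},
        {"role": "user", "content": task},
    ]

messages = with_persona(
    "You are a board-certified emergency physician.",
    "List the three most critical immediate actions in order of priority.",
)
```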
05 Structured Output
Structured output prompting instructs the model to respond in a machine-parseable format — JSON, CSV, XML, or a custom schema. Use it whenever the response feeds into code, a database, or another automated system.
When to use it
Any time the response will be parsed programmatically. Most modern LLM APIs support a response_format or JSON mode that enforces valid syntax — use this over prompt-level instructions when available.
Template
Extract all product mentions from the text below.
Return a JSON array where each item has: name (string), price_usd (number or null),
sentiment (one of: positive, neutral, negative).
Text: "The new AirPods Pro at $249 are excellent, but the $129 Beats Flex feels cheap."
JSON:
Combining with few-shot
For complex schemas, add one complete example before the real input. This eliminates ambiguity about which fields are required and what null values look like.
Known failure modes
- Model may produce syntactically valid but semantically wrong JSON — validate with a schema
- Very large schemas cause the model to omit fields — use API-level JSON mode
- Nested schemas beyond 3 levels deep are unreliable without few-shot examples
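The first two failure modes above argue for validating parsed output, not just parsing it. A minimal sketch (field names match the template above; `validate_products` is an illustrative helper, not a schema library — in production, a library like jsonschema or Pydantic does this more robustly):

```python
import json

REQUIRED = {"name": str, "price_usd": (int, float, type(None)), "sentiment": str}
SENTIMENTS = {"positive", "neutral", "negative"}

def validate_products(raw: str) -> list:
    """Parse the model's JSON and check each item against the expected schema."""
    items = json.loads(raw)
    for item in items:
        for field, types in REQUIRED.items():
            if field not in item or not isinstance(item[field], types):
                raise ValueError(f"Bad field {field!r} in {item}")
        if item["sentiment"] not in SENTIMENTS:
            raise ValueError(f"Bad sentiment in {item}")
    return items
```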
06 Self-Consistency
Self-consistency generates multiple independent answers and selects the most common one. It improves accuracy by trading latency and cost for reliability — particularly useful for high-stakes outputs where a single run may be wrong.
When to use it
Use self-consistency for any classification, extraction, or reasoning task where accuracy matters more than speed. Run 3–7 completions with temperature > 0 and take the majority answer.
Implementation pattern
# Pseudocode — llm.complete and parse_answer stand in for your client and parser
from collections import Counter

answers = []
for i in range(5):
    response = llm.complete(prompt, temperature=0.7)
    answers.append(parse_answer(response))
final_answer = Counter(answers).most_common(1)[0][0]
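To see majority voting in action without an API, here is a runnable version with a stubbed sampler (`fake_complete` is hypothetical: it simulates a model that answers the train question correctly most of the time):

```python
import random
from collections import Counter

def fake_complete(prompt: str) -> str:
    """Stand-in for an LLM call: correct 80% of the time, wrong otherwise."""
    return "3.5 hours" if random.random() < 0.8 else "4 hours"

def self_consistent_answer(prompt: str, n: int = 5) -> str:
    """Sample n answers and return the majority vote."""
    answers = [fake_complete(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

random.seed(0)
print(self_consistent_answer("How long does a train at 90 km/h take to cover 315 km?"))
```

Even when one or two samples are wrong, the majority recovers the correct answer — which is exactly the reliability trade the section describes.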
Known failure modes
- Cost multiplies with N runs — only worth it for tasks with verifiable correctness
- If the majority of runs share a systematic error, self-consistency amplifies the mistake
- Doesn't help with creative tasks where there's no "correct" answer