Prompt chaining
Overview
Prompt chaining is a technique in prompt engineering where the output of one language model invocation serves as input to a subsequent prompt, creating a multi-step pipeline to solve complex problems. Rather than attempting to solve an entire task in a single prompt, prompt chaining breaks the task into intermediate steps, each handled by a separate prompt-completion cycle.
This approach decomposes cognitively difficult tasks into more tractable subtasks that an LLM can handle one at a time. Each chain link may transform, filter, validate, or extract information from the previous step's output. Prompt chaining differs from in-context learning in that outputs are passed explicitly between distinct prompts across separate model calls, rather than all instructions and examples being embedded in a single prompt.
The technique is foundational to agentic workflows, multi-agent orchestration, and systems that require structured reasoning or quality gates between reasoning steps. Prompt chaining can be implemented deterministically (with fixed routing) or conditionally (where intermediate outputs determine which prompt executes next), and is often combined with retrieval-augmented generation or external tools to ground outputs in facts.
How it works
Prompt chaining operates via a sequence of stages, each executing a distinct prompt and capturing its output for use in a subsequent stage:
- Input preparation: The initial query or data enters the first prompt. This prompt may be tasked with clarifying the input, decomposing it, or performing preliminary analysis.
- Intermediate processing: The output from stage 1 is parsed and formatted as input to stage 2. This may involve structured extraction (e.g., chunked text, JSON fields, or key-value pairs) to ensure the output is compatible with the next prompt's expectations.
- Iterative refinement: Subsequent prompts apply specialized transformations (such as consistency checking, query rewriting, self-reflection, or critical evaluation) to outputs from prior stages.
- Termination and output: The final prompt in the chain produces the end-user-facing output, which a guardrail filter may validate before it is returned.
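The stages above can be sketched as a simple loop that feeds each prompt's output into the next. This is a minimal illustration, not a reference implementation: `run_chain`, the prompt templates, and the offline `echo_llm` stand-in are all hypothetical names introduced here, and a real system would replace `echo_llm` with a call to an actual model client.

```python
from typing import Callable

# Hypothetical stand-in type for a model client: takes a prompt, returns text.
LLM = Callable[[str], str]


def run_chain(llm: LLM, templates: list[str], user_input: str) -> list[str]:
    """Run prompts in order; each template's {input} slot receives the previous output."""
    outputs = []
    current = user_input
    for template in templates:
        current = llm(template.format(input=current))
        outputs.append(current)  # keep every intermediate output for inspection
    return outputs


def echo_llm(prompt: str) -> str:
    """Toy model so the sketch runs offline: upper-cases the prompt's last line."""
    return prompt.splitlines()[-1].upper()


# Illustrative templates, one per stage described above.
templates = [
    "Clarify the request:\n{input}",
    "Extract the key facts:\n{input}",
    "Check for inconsistencies:\n{input}",
    "Write the final answer:\n{input}",
]

outputs = run_chain(echo_llm, templates, "refund for order 17")
final_answer = outputs[-1]
```

Returning the full list of outputs, rather than only the last one, keeps the intermediate results available for logging or validation between stages.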
Implementation typically requires explicit state management: storing intermediate outputs, handling hallucination or failure in any stage, and managing context window constraints across multiple API calls. Some chains incorporate automated evaluation or human evaluation gates between steps to verify quality before proceeding.
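One way to approach the state management described above is to store each stage's parsed output in a shared state object and retry a stage when its output fails validation. The sketch below assumes stages return JSON; `ChainState`, `run_stage`, and the flaky toy model are hypothetical names for illustration, and context window management is not covered here.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model client: prompt in, text out


@dataclass
class ChainState:
    """Intermediate outputs keyed by stage name, plus a log of failed attempts."""
    outputs: dict[str, dict] = field(default_factory=dict)
    errors: list[str] = field(default_factory=list)


def run_stage(llm: LLM, state: ChainState, name: str, prompt: str,
              required_keys: set[str], max_attempts: int = 2) -> dict:
    """Call the model, validate its JSON output, and retry on failure."""
    for attempt in range(1, max_attempts + 1):
        raw = llm(prompt)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            state.errors.append(f"{name} attempt {attempt}: invalid JSON")
            continue
        if not isinstance(parsed, dict) or not required_keys.issubset(parsed):
            state.errors.append(f"{name} attempt {attempt}: missing keys")
            continue
        state.outputs[name] = parsed  # quality gate passed: store and proceed
        return parsed
    raise RuntimeError(f"stage {name!r} failed after {max_attempts} attempts")


def make_flaky_llm() -> LLM:
    """Toy model: the first reply is malformed, the second is valid JSON."""
    replies = iter(["Sure! Here you go", '{"category": "billing", "urgency": "high"}'])
    return lambda prompt: next(replies)


state = ChainState()
result = run_stage(make_flaky_llm(), state, "classify",
                   "Classify this ticket as JSON.", {"category", "urgency"})
```

Raising after the final attempt, instead of passing a bad output downstream, stops one failed stage from contaminating the rest of the chain.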
Prompt chaining can be visualized as a directed acyclic graph (DAG) where nodes are prompts and edges represent data flow. Conditional logic (e.g., branching based on intermediate output) extends this to more complex orchestration patterns.
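The DAG view can be made concrete with Python's standard-library `graphlib.TopologicalSorter`, which yields nodes so that every prompt runs after the prompts it depends on. The function `run_dag`, the node names, and the recording toy model below are assumptions made for illustration only.

```python
from graphlib import TopologicalSorter
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model client: prompt in, text out


def run_dag(llm: LLM, prompts: dict[str, str], deps: dict[str, set[str]],
            user_input: str) -> dict[str, str]:
    """Execute prompts in dependency order; edges carry upstream outputs."""
    results: dict[str, str] = {}
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    for node in TopologicalSorter(deps).static_order():
        upstream = "\n".join(results[d] for d in sorted(deps.get(node, ())))
        context = upstream or user_input  # root nodes read the raw input
        results[node] = llm(f"{prompts[node]}\n{context}")
    return results


seen: list[str] = []


def recording_llm(prompt: str) -> str:
    """Toy model: records each prompt and echoes its first line in upper case."""
    seen.append(prompt)
    return prompt.splitlines()[0].upper()


prompts = {
    "summarize": "summarize:",
    "entities": "list entities:",
    "claims": "list claims:",
    "verify": "verify:",
}
# Each node maps to the set of nodes whose outputs it consumes.
deps = {
    "summarize": set(),
    "entities": {"summarize"},
    "claims": {"summarize"},
    "verify": {"entities", "claims"},
}
results = run_dag(recording_llm, prompts, deps, "long document text")
```

Here `entities` and `claims` both depend only on `summarize`, so their relative order is not fixed; `verify` always runs last because it consumes both.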
| Related term | Distinction from prompt chaining |
|---|---|
| Chain-of-thought | Chain-of-thought occurs within a single prompt, using structured reasoning steps (step 1, step 2, ...) to improve reasoning within one completion. Prompt chaining involves multiple separate prompts and explicit output capture between them, enabling modularity and state management across API calls. |
| In-context learning | In-context learning embeds all examples and instructions within a single prompt to guide behavior. Prompt chaining executes separate prompts sequentially, allowing intermediate outputs to be validated, filtered, or transformed before feeding to the next prompt. |
| Agentic workflow | Agentic workflows typically involve loops, tool use, and goal-directed iteration with planning. Prompt chaining is a linear or acyclic sequence of prompts without planning or looping logic; it is a component of agentic systems but not inherently agentic. |
| Multi-agent orchestration | Multi-agent orchestration coordinates multiple distinct LLM agents with separate memories, identities, or capabilities. Prompt chaining sequences prompts (which may or may not be agents) in a predetermined pipeline; agents are optional. |
| Retrieval-augmented generation (RAG) | RAG augments a single prompt with retrieved external documents before generation. Prompt chaining uses multiple prompts; a chain can incorporate RAG at any step, but they address different problems (context enrichment vs. task decomposition). |
Examples
Document summarization with verification pipeline: A system summarizes a long document in the first prompt, then passes the summary to a second prompt that identifies key entities and claims. A third prompt cross-checks those claims against the original document to detect hallucinated facts. Each stage produces structured output (summary JSON, entity list, verification report) that feeds the next.
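A rough sketch of this pipeline follows. It assumes each stage replies with JSON, and it uses canned replies in place of a real model so that it runs offline; the function name, prompt wording, and JSON keys are hypothetical. Note that the third stage receives the original document again, not only the previous stage's output.

```python
import json
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model client: prompt in, text out


def summarize_and_verify(llm: LLM, document: str) -> dict:
    """Three chained prompts: summarize, extract claims, verify against the source."""
    summary = json.loads(
        llm(f"Summarize the document as JSON with key 'summary'.\n\n{document}")
    )["summary"]
    claims = json.loads(
        llm(f"List the factual claims as JSON with key 'claims'.\n\n{summary}")
    )["claims"]
    report = json.loads(
        llm(
            "Return JSON with key 'unsupported' listing claims the document "
            f"does not support.\n\nDocument:\n{document}\n\n"
            f"Claims:\n{json.dumps(claims)}"
        )
    )
    return {"summary": summary, "claims": claims, "unsupported": report["unsupported"]}


# Canned replies standing in for real model output (illustrative data only).
replies = [
    '{"summary": "Acme opened a Berlin office. Revenue doubled."}',
    '{"claims": ["Acme opened a Berlin office", "Revenue doubled"]}',
    '{"unsupported": ["Revenue doubled"]}',
]
prompts_seen: list[str] = []


def canned_llm(prompt: str) -> str:
    """Toy model: records the prompt and returns the next canned reply."""
    prompts_seen.append(prompt)
    return replies[len(prompts_seen) - 1]


document = "Acme opened a new office in Berlin in May."
report = summarize_and_verify(canned_llm, document)
```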
Customer support ticket routing and response: An initial prompt classifies an incoming support ticket by urgency and category. The classification output is then used to route the ticket to a specialized second prompt tailored to that category's common resolutions. A third prompt generates a polished response using the second prompt's output, and a final critic prompt evaluates it for tone and accuracy before it is sent to the user.
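The routing step is an instance of conditional chaining: the classifier's output selects which prompt runs next. The sketch below is a simplified illustration that classifies by category only (urgency is omitted); `CATEGORY_PROMPTS`, `handle_ticket`, and the rule-based `rule_llm` stand-in are hypothetical names, not part of any particular library.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model client: prompt in, text out

# One specialized prompt template per category (illustrative wording).
CATEGORY_PROMPTS = {
    "billing": "You handle billing issues. Propose a resolution for:\n{ticket}",
    "technical": "You handle technical issues. Propose a resolution for:\n{ticket}",
}
FALLBACK_PROMPT = "Propose a resolution for this support ticket:\n{ticket}"


def handle_ticket(llm: LLM, ticket: str) -> dict:
    """Classify, route to a category-specific prompt, draft a reply, then critique it."""
    category = llm(
        f"Classify this ticket as one of {sorted(CATEGORY_PROMPTS)}. "
        f"Reply with the label only.\n{ticket}"
    ).strip().lower()
    # Conditional routing: unknown labels fall back to a generic prompt.
    template = CATEGORY_PROMPTS.get(category, FALLBACK_PROMPT)
    resolution = llm(template.format(ticket=ticket))
    draft = llm(f"Write a polite reply to the customer based on this resolution:\n{resolution}")
    verdict = llm(f"Reply APPROVE or REJECT for tone and accuracy:\n{draft}").strip().upper()
    return {"category": category, "reply": draft, "approved": verdict.startswith("APPROVE")}


def rule_llm(prompt: str) -> str:
    """Toy model so the sketch runs offline: answers by matching the prompt's first line."""
    first = prompt.splitlines()[0]
    if first.startswith("Classify"):
        return "Billing" if "charged" in prompt else "technical"
    if first.startswith("You handle billing"):
        return "Refund the duplicate charge."
    if first.startswith("Write a polite"):
        return "We're sorry; a refund is on its way."
    if first.startswith("Reply APPROVE"):
        return "approve"
    return "Escalate to a human."


outcome = handle_ticket(rule_llm, "I was charged twice this month.")
```

A caller would typically send `outcome["reply"]` only when `outcome["approved"]` is true, and otherwise regenerate the draft or escalate.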
Question answering with clarification: A first prompt analyzes an ambiguous user question and generates clarification candidates or disambiguating sub-questions. A second prompt either asks the user for clarification (if ambiguity is high) or proceeds to retrieve relevant documents via RAG. A third prompt synthesizes the retrieved context with the clarified question to produce the final answer, with an optional fourth prompt performing consistency checking against the retrieved sources.
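The branch between asking the user and proceeding to retrieval might look like the following. This is a simplified sketch: it reduces the first stage to a numeric ambiguity rating, and `answer_question`, the threshold value, the `retrieve` callable, and the toy model are all assumptions introduced for illustration. The optional consistency-checking prompt is omitted.

```python
from typing import Callable

LLM = Callable[[str], str]               # hypothetical model client
Retriever = Callable[[str], list[str]]   # hypothetical document retriever


def answer_question(llm: LLM, retrieve: Retriever, question: str,
                    ambiguity_threshold: float = 0.5) -> dict:
    """Ask for clarification when ambiguity is high; otherwise retrieve and answer."""
    rating = llm(f"Rate the ambiguity of this question from 0 to 1. "
                 f"Reply with the number only.\n{question}")
    try:
        ambiguity = float(rating)
    except ValueError:
        ambiguity = 1.0  # unparseable rating: treat the question as ambiguous
    if ambiguity > ambiguity_threshold:
        follow_up = llm(f"Write one clarifying question for:\n{question}")
        return {"action": "clarify", "text": follow_up}
    passages = retrieve(question)
    context = "\n".join(passages)
    answer = llm(f"Answer using only the context.\nContext:\n{context}\n"
                 f"Question: {question}")
    return {"action": "answer", "text": answer, "sources": passages}


def toy_llm(prompt: str) -> str:
    """Toy model so the sketch runs offline: treats questions ending in 'it?' as vague."""
    if prompt.startswith("Rate the ambiguity"):
        return "0.9" if prompt.endswith("it?") else "0.1"
    if prompt.startswith("Write one clarifying"):
        return "Which device do you mean?"
    return "Hold the reset button for ten seconds."


def toy_retrieve(question: str) -> list[str]:
    """Toy retriever returning a single fixed passage (illustrative data only)."""
    return ["Model X manual: hold the reset button for ten seconds."]


vague = answer_question(toy_llm, toy_retrieve, "How do I reset it?")
clear = answer_question(toy_llm, toy_retrieve, "How do I reset a Model X router?")
```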
See also
- Chain-of-thought
- Agentic workflow
- Orchestration pattern
- Prompt engineering
- Self-reflection (AI)
- Retrieval-augmented generation
- Guardrails