Least-to-most prompting

From llmref.wiki
Least-to-most prompting — Decomposing a difficult problem into progressively harder sub-problems, solved sequentially from simplest to most complex.

Overview

Least-to-most prompting is a prompt engineering strategy that breaks down a complex task into a chain of simpler sub-problems, ordered by difficulty or dependency. The language model solves each sub-problem in sequence, using earlier solutions as context for subsequent steps. This approach leverages the model's ability to handle step-by-step reasoning while reducing failure modes associated with attempting difficult problems directly.

The method is particularly effective for tasks that have clear decomposable structure—such as mathematical reasoning, algorithmic problem-solving, and multi-step logical inference. By ensuring the model succeeds on easier instances first, least-to-most prompting provides concrete intermediate outputs that improve the likelihood of correct final results.

This technique differs from undirected prompt chaining in its explicit ordering principle: steps are arranged from low to high difficulty rather than by task workflow alone. It also complements chain-of-thought prompting, which encourages step-by-step reasoning within a single problem, by providing a structural framework for multi-problem decomposition.

How it works

Least-to-most prompting operates through three main phases:

Problem decomposition: The initial task is analyzed to identify logical sub-problems that can be ordered by complexity. Simple sub-problems are those the model is highly likely to solve correctly in isolation; harder ones depend on solutions to earlier steps or require more advanced reasoning.

Sequential solving: The model is prompted to solve sub-problems in ascending order of difficulty. The prompt includes solutions to all previously solved sub-problems as context. This sequential dependency ensures that each step builds on verified intermediate results rather than relying on a single end-to-end attempt.

Context propagation: Earlier outputs are incorporated into the prompt context for subsequent sub-problems. This differs from in-context learning with static examples, as the context becomes problem-specific and dynamically constructed during the solving process.

A concrete implementation typically uses a two-stage structure: a planning stage that decomposes the problem, and an execution stage that solves each sub-problem with accumulated context. The model may generate the decomposition itself, or it may be provided by the prompt designer.

Distinction from related terms

Term Distinction
Chain-of-thought prompting CoT encourages reasoning steps within a single problem. Least-to-most decomposes multiple problems and solves them in order of difficulty. CoT focuses on transparency of reasoning; least-to-most focuses on progressive problem reduction.
Prompt chaining Prompt chaining executes sequential prompts in a predetermined workflow order. Least-to-most specifically orders by difficulty and uses each solution as mandatory context for the next step. Chaining does not require difficulty ordering.
Query fan-out Fan-out generates multiple parallel solutions or interpretations. Least-to-most generates sequential, dependent solutions where earlier outputs inform later steps. Fan-out diverges; least-to-most converges toward a final answer.
In-context learning ICL uses static examples in the prompt context. Least-to-most uses dynamically generated intermediate solutions as context. ICL examples are fixed; least-to-most context is problem-specific and accumulates during execution.

Examples

Mathematical problem-solving: A task requiring multi-step arithmetic or algebra can be decomposed into simpler equations. The model first solves sub-expressions or foundational equations, then uses those results to solve more complex composite equations. For instance, solving "2x + (3x + 5) = 20" might decompose into simplifying (3x + 5), then combining like terms, then isolating x.

Programming synthesis: Complex code generation tasks can decompose into simpler function implementations. A model solving a graph traversal algorithm might first implement a basic vertex-visit function, then build a depth-first search around it, then handle cycle detection. Each step is verified before proceeding to the next.

Logical reasoning over knowledge: Multi-hop reasoning tasks where answering the final question requires answering intermediate questions in sequence. For example, "Who is the child of the person born in London who founded TechCorp?" might decompose into: (1) Who founded TechCorp? (2) Where was that person born? (3) Who is their child? Answers are retrieved and chained in order.

See also

References