Hallucination

From llmref.wiki
Hallucination — Fluent, confident model output that is factually false or unsupported by any source.

Overview

Hallucination in language models refers to generated text that is fluent and internally coherent but factually incorrect, fabricated, or unsupported by the model's training data or any cited source. The term is borrowed loosely from cognitive psychology — where hallucinations are perceptions without external stimuli — to describe model outputs that present false information with the same confidence as true information.

Hallucination is not a rare edge case; it is a structural property of how autoregressive language models generate text. Models predict the next token based on statistical patterns, not by consulting a factual database. Fluency and factual accuracy are separate properties, and a model may maximize fluency while producing false content.
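
The separation between fluency and accuracy can be illustrated with a toy decoder. The sketch below is only an illustration: the probability table and the fact table are invented, and a real model computes its distribution with a neural network rather than a lookup. It shows greedy decoding choosing the most probable continuation without ever consulting a source of truth.

```python
# Hypothetical next-token probabilities for the prompt
# "The capital of Australia is". The numbers are invented for illustration.
NEXT_TOKEN_PROBS = {"Sydney": 0.55, "Canberra": 0.35, "Melbourne": 0.10}

# A separate, external source of truth that the decoder never consults.
FACTS = {"capital of Australia": "Canberra"}


def greedy_next_token(probs):
    """Return the single most probable token (greedy decoding)."""
    return max(probs, key=probs.get)


def is_factual(token):
    """Check a token against the external fact table."""
    return token == FACTS["capital of Australia"]


choice = greedy_next_token(NEXT_TOKEN_PROBS)
print(choice, is_factual(choice))  # prints: Sydney False
```

In this sketch the decoding step and the fact check are separate functions, and nothing in `greedy_next_token` depends on `FACTS`.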

The term is widely contested as imprecise: critics argue it anthropomorphizes a statistical process and conflates several distinct failure types. More specific sub-types include faithfulness failures (answer diverges from a provided source), factuality failures (answer conflicts with world knowledge), and citation hallucination (fabricated references).

Types of hallucination

Type | Description | Example
Factuality hallucination | Stated fact is incorrect | Wrong birth date for a real person
Citation hallucination | Cited source does not exist or does not say what is claimed | Fabricated arXiv paper DOI
Faithfulness hallucination | Answer contradicts a provided document (RAG context) | Summarizing a passage with an added false claim
Entity hallucination | Invented entity presented as real | Nonexistent company or person named confidently

Why it occurs

Hallucination emerges from the architecture of next-token prediction: the model has no symbolic truth-checking step. High-frequency training patterns reinforce plausible-sounding associations even when factually incorrect. Low-frequency facts (rare names, dates, niche references) are most vulnerable because the model has fewer training examples to anchor them.
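
The vulnerability of low-frequency facts can be sketched with a count-based predictor. This is a deliberately crude stand-in, not a description of how neural models work: the training pairs, the entity names, and the `MIN_COUNT` backoff threshold are all hypothetical. The point is only that when an entity has little evidence of its own, the prediction is dominated by the most common overall pattern.

```python
from collections import Counter, defaultdict

# Invented (entity, birth_year) training pairs. "Common Person" appears often,
# "Rare Person" appears only once.
TRAINING_PAIRS = [("Common Person", "1900")] * 40 + [("Rare Person", "1950")]

MIN_COUNT = 5  # hypothetical threshold below which entity evidence is ignored


def train(pairs):
    """Count years per entity and across the whole corpus."""
    per_entity = defaultdict(Counter)
    overall = Counter()
    for entity, year in pairs:
        per_entity[entity][year] += 1
        overall[year] += 1
    return per_entity, overall


def predict_year(entity, per_entity, overall, min_count=MIN_COUNT):
    """Use entity-specific counts if there are enough, else the global pattern."""
    counts = per_entity.get(entity, Counter())
    if sum(counts.values()) >= min_count:
        return counts.most_common(1)[0][0]
    return overall.most_common(1)[0][0]


per_entity, overall = train(TRAINING_PAIRS)
print(predict_year("Common Person", per_entity, overall))  # prints: 1900
print(predict_year("Rare Person", per_entity, overall))    # prints: 1900
```

The second prediction is plausible in form (a year) but wrong for the rare entity, whose single training example gives 1950.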

Retrieval-augmented generation (RAG) reduces factuality hallucination by conditioning generation on retrieved documents, but it can introduce faithfulness hallucination when the model paraphrases the source inaccurately or adds claims the source does not contain.
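
A minimal sketch of the conditioning step, under stated assumptions: the document store is invented, the word-overlap ranking stands in for a real retriever, and the prompt wording is one hypothetical choice among many. No model is called; the example only shows how retrieved text is placed in front of the question.

```python
import re

# Invented mini document store.
DOCUMENTS = [
    "Canberra is the capital city of Australia.",
    "Sydney is the most populous city in Australia.",
    "Wellington is the capital city of New Zealand.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Rank documents by word overlap with the question (stand-in retriever)."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Place the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


question = "What is the capital of Australia?"
passages = retrieve(question, DOCUMENTS)
print(build_prompt(question, passages))
```

The instruction in the prompt asks for faithfulness but cannot enforce it, which is why the model's answer still has to be checked against the retrieved passages.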

Measurement

Hallucination rate is measured differently depending on subtype:

  • Factuality: human annotation or LLM-as-judge comparison against a verified knowledge base.
  • Faithfulness: NLI-based metrics (e.g., AlignScore, SummaC) or the Ragas faithfulness metric.
  • Citation accuracy: URL retrieval followed by source-claim comparison (e.g., a FActScore-style pipeline).
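
The shape of a faithfulness check can be sketched as follows. This is not how AlignScore, SummaC, or Ragas work internally: token overlap is used here as a crude, clearly labelled stand-in for an entailment model, and the `threshold` value, the source passage, and the answer are all invented. The sketch splits an answer into sentences and flags those with little lexical support in the source.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def support_score(claim, source):
    """Fraction of the claim's tokens that also appear in the source."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(source)) / len(claim_tokens)


def unsupported_sentences(answer, source, threshold=0.7):
    """Return answer sentences whose support score falls below the threshold."""
    return [s for s in split_sentences(answer)
            if support_score(s, source) < threshold]


SOURCE = ("The bridge opened in 1932. "
          "It spans the harbour and carries road and rail traffic.")
ANSWER = ("The bridge opened in 1932. "
          "It was designed by a famous Italian architect.")

flagged = unsupported_sentences(ANSWER, SOURCE)
print(flagged)  # prints: ['It was designed by a famous Italian architect.']
```

Overlap catches added material but not contradictions: a sentence such as "The bridge opened in 1923." shares most of its tokens with the source and would pass this check, which is one reason practical metrics rely on entailment models instead.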

No single metric captures all hallucination types; evaluation must specify which subtype is being measured.
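
The effect of reporting a single number can be shown with a small calculation. The annotation records below are invented and the record layout is a hypothetical one; the sketch computes a hallucination rate per subtype and compares it with the pooled rate.

```python
from collections import defaultdict

# Invented annotation records: one per evaluated claim, labelled for one subtype.
ANNOTATIONS = [
    {"subtype": "factuality", "hallucinated": True},
    {"subtype": "factuality", "hallucinated": False},
    {"subtype": "factuality", "hallucinated": False},
    {"subtype": "factuality", "hallucinated": False},
    {"subtype": "faithfulness", "hallucinated": True},
    {"subtype": "faithfulness", "hallucinated": False},
    {"subtype": "citation", "hallucinated": True},
    {"subtype": "citation", "hallucinated": True},
]


def hallucination_rates(records):
    """Return the fraction of flagged records for each subtype."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        totals[record["subtype"]] += 1
        flagged[record["subtype"]] += int(record["hallucinated"])
    return {subtype: flagged[subtype] / totals[subtype] for subtype in totals}


rates = hallucination_rates(ANNOTATIONS)
pooled = sum(r["hallucinated"] for r in ANNOTATIONS) / len(ANNOTATIONS)

print(rates)   # prints: {'factuality': 0.25, 'faithfulness': 0.5, 'citation': 1.0}
print(pooled)  # prints: 0.5
```

The pooled figure of 0.5 hides the spread between 0.25 and 1.0 across subtypes in this toy data.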

Distinction from related terms

  • Hallucination is not the same as groundedness failure: a model can be faithfully grounded in a source that is itself wrong.
  • Hallucination is not the same as uncertainty: a model may state falsehoods with high expressed confidence, or hedge statements that are true.
  • Citation hallucination is a subtype, not a synonym.
