Hallucination

Hallucination — Fluent, confident model output that is factually false or unsupported by any source.

Overview

Hallucination in language models refers to generated text that is fluent and internally coherent but factually incorrect, fabricated, or unsupported by the model's training data or any cited source. The term is borrowed loosely from cognitive psychology — where hallucinations are perceptions without external stimuli — to describe model outputs that present false information with the same confidence as true information.

Hallucination is not a rare edge case; it is a structural property of how autoregressive language models generate text. Models predict the next token based on statistical patterns, not by consulting a factual database. Fluency and factual accuracy are separate properties, and a model may maximize fluency while producing false content.

The term is widely contested as imprecise: critics argue it anthropomorphizes a statistical process and conflates several distinct failure types. More specific sub-types include faithfulness failures (answer diverges from a provided source), factuality failures (answer conflicts with world knowledge), and citation hallucination (fabricated references).

Types of hallucination

Type	Description	Example
Factuality hallucination	Stated fact is incorrect	Wrong birth date for a real person
Citation hallucination	Cited source does not exist or does not say what is claimed	Fabricated arXiv paper DOI
Faithfulness hallucination	Answer contradicts a provided document (e.g., RAG context) or adds claims it does not support	Summarizing a passage with an added false claim
Entity hallucination	Invented entity presented as real	Nonexistent company or person named confidently

Why it occurs

Hallucination emerges from the architecture of next-token prediction: the model has no symbolic truth-checking step. High-frequency training patterns reinforce plausible-sounding associations even when factually incorrect. Low-frequency facts (rare names, dates, niche references) are most vulnerable because the model has fewer training examples to anchor them.
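The frequency effect described above can be illustrated with a deliberately tiny counting model. This is a minimal sketch and not how a neural language model is implemented: the function names `train_prefix_counts` and `predict_next` and the four-sentence toy corpus are assumptions made up for illustration. It only shows that a predictor driven by pattern frequency returns the most common continuation, with no step that checks whether it is true.

```python
from collections import Counter, defaultdict

def train_prefix_counts(sentences):
    """Count which token follows each prefix in the (toy) training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(1, len(tokens)):
            counts[tuple(tokens[:i])][tokens[i]] += 1
    return counts

def predict_next(counts, prefix):
    """Return the most frequent continuation of the prefix, or None if unseen."""
    followers = counts.get(tuple(prefix.lower().split()))
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus: the common misconception outnumbers the correct fact.
toy_corpus = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
]

counts = train_prefix_counts(toy_corpus)
prediction = predict_next(counts, "the capital of australia is")
print(prediction)  # the frequent continuation wins, although Canberra is correct
```

In this toy setting the correct answer appears once and the incorrect one three times, so the frequent pattern is returned with no signal that it is wrong.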

Retrieval-augmented generation (RAG) reduces factuality hallucination by conditioning generation on retrieved documents, but it shifts the risk toward faithfulness hallucination: the model may paraphrase the source inaccurately or add claims the retrieved documents do not support.
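One rough way to picture a faithfulness check is to flag answer sentences whose content words are mostly absent from the retrieved context. The sketch below uses plain lexical overlap as a crude stand-in for an entailment model; the function names, the stopword list, the 0.6 threshold, and the example strings are all assumptions for illustration, and a heuristic like this would miss paraphrases and contradictions that reuse the same words.

```python
import re

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "in", "of", "it", "is", "was", "by", "and", "to", "on"}

def content_words(text):
    """Lowercased alphanumeric tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def unsupported_sentences(answer, context, threshold=0.6):
    """Return answer sentences whose content-word coverage by the context is below threshold."""
    context_vocab = content_words(context)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if not words:
            continue
        coverage = len(words & context_vocab) / len(words)
        if coverage < threshold:
            flagged.append(sentence)
    return flagged

# Hypothetical retrieved context and generated answer.
context = "The bridge opened in 1932. It spans the harbour."
answer = "The bridge opened in 1932. It was designed by a famous local architect."

flagged = unsupported_sentences(answer, context)
print(flagged)
```

Here the first answer sentence is fully covered by the context, while the second introduces a claim about the designer that the context never mentions, so only the second is flagged.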

Measurement

Hallucination rate is measured differently depending on subtype:

Factuality: human annotation or LLM-as-judge comparison against a verified knowledge base.
Faithfulness: NLI-based metrics (e.g., AlignScore, SummaC) or the Ragas faithfulness metric.
Citation accuracy: URL retrieval followed by source-claim comparison (e.g., a FActScore-style pipeline).

No single metric captures all hallucination types; evaluation must specify which subtype is being measured.
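Reporting one rate per subtype, rather than a single pooled number, can be sketched as follows. This assumes each evaluated output has already been labeled by annotators or a judge as a `(subtype, is_hallucinated)` pair; the function name and the sample annotations are hypothetical.

```python
from collections import defaultdict

def hallucination_rates(annotations):
    """Compute the fraction of flagged outputs separately for each subtype."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for subtype, is_hallucinated in annotations:
        totals[subtype] += 1
        if is_hallucinated:
            flagged[subtype] += 1
    return {subtype: flagged[subtype] / totals[subtype] for subtype in totals}

# Hypothetical annotations: (subtype being evaluated, judged as hallucinated?)
sample_annotations = [
    ("factuality", True),
    ("factuality", False),
    ("factuality", False),
    ("factuality", False),
    ("faithfulness", True),
    ("faithfulness", False),
    ("citation", True),
    ("citation", True),
]

rates = hallucination_rates(sample_annotations)
print(rates)
```

Pooling these eight samples would give a single rate of 0.5, which hides the fact that the three subtypes differ widely in this made-up data.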

Distinction from related terms

Hallucination is not the same as groundedness failure: an answer can be faithfully grounded in a source that is itself wrong.
Hallucination is not the same as uncertainty: a model may state falsehoods with high expressed confidence, or hedge statements that are true.
Citation hallucination is a subtype, not a synonym.
