Hallucinated citation
Overview
A hallucinated citation is a bibliographic reference generated by a language model that does not correspond to an actual published work, misattributes content to a real author, or presents fabricated publication details such as titles, venues, or dates. This phenomenon occurs because language models generate text based on statistical patterns in training data rather than retrieving verified information from databases or the internet.
Hallucinated citations differ from simple citation formatting errors. They represent fundamentally false attributions where the source material does not exist, the author did not write the attributed work, or the cited text does not contain the claim being attributed to it. This creates a particular credibility hazard in academic, legal, and professional contexts where citations serve as evidence trails.
Hallucinated citations have been reported across multiple model architectures and scales. Unlike other forms of hallucination, they carry implicit epistemic authority: the citation format itself signals verifiability, even when the source does not exist. This mismatch between form and substance poses distinct downstream risks for knowledge validation and reproducibility.
How it is measured
Hallucinated citations are typically evaluated through manual annotation or automated comparison against verified databases. Evaluation protocols include:
- **Citation verification**: Cross-referencing generated citations against digital libraries (Google Scholar, PubMed, arXiv, JSTOR) to confirm existence and attribution accuracy.
- **Content matching**: Retrieving cited sources and verifying that the attributed quotation or claim actually appears in the source material.
- **Annotation-based metrics**: Human reviewers classify citations as correct, hallucinated (fully fabricated), or distorted (real source, incorrect attribution or paraphrasing).
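The three protocols above can be combined into one classification step. The following is a minimal sketch, not a reference implementation: the `Record` type, the in-memory `INDEX`, the 0.9 similarity threshold, and the substring-based content check are all illustrative assumptions standing in for a real digital-library lookup and a real entailment check. Absence from an index is evidence of fabrication, not proof, so a production pipeline would still route "hallucinated" results to human review.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    """One verified entry in a hypothetical local bibliographic index."""
    title: str
    authors: tuple[str, ...]
    year: int
    full_text: str


def title_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify_citation(title, first_author, claim, index, threshold=0.9):
    """Return 'correct', 'distorted', or 'hallucinated' for one citation."""
    # Citation verification: find the closest title in the verified index.
    best = max(index, key=lambda r: title_similarity(title, r.title), default=None)
    if best is None or title_similarity(title, best.title) < threshold:
        return "hallucinated"  # no matching source found in the index
    # Attribution check: the cited first author must appear on the record.
    if first_author.lower() not in (a.lower() for a in best.authors):
        return "distorted"
    # Content matching: naive substring check (assumption; real systems
    # would need paraphrase-tolerant matching).
    if claim.lower() not in best.full_text.lower():
        return "distorted"
    return "correct"


# Toy index with a made-up record, for illustration only.
INDEX = [
    Record(
        title="A Toy Study of Citation Accuracy",
        authors=("Doe", "Roe"),
        year=2021,
        full_text="We find that citation accuracy varies by domain.",
    ),
]

print(classify_citation("A Toy Study of Citation Accuracy", "Doe",
                        "citation accuracy varies by domain", INDEX))
print(classify_citation("A Toy Study of Citation Accuracy", "Smith",
                        "citation accuracy varies by domain", INDEX))
print(classify_citation("Deep Survey of Imaginary Results", "Doe",
                        "anything", INDEX))
```

The labels match the three annotation classes listed above, so automated output and human annotations can be compared directly.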
No standardized benchmark for hallucinated citation rates exists across model families. Individual studies have reported rates ranging from about 5% to 40%, depending on model size, domain, and citation density in prompts, so figures from different studies are not directly comparable. Some evaluation frameworks treat hallucinated citations as a subclass of hallucination and measure them within broader factuality assessments.
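A reported rate is a proportion over a sample of annotated citations, so its uncertainty depends on sample size. The sketch below, which assumes a flat list of annotation labels and uses made-up data, computes per-label rates and a Wilson score interval for the hallucinated fraction. The function names and the example labels are hypothetical.

```python
from collections import Counter
from math import sqrt


def label_rates(labels):
    """Fraction of citations carrying each annotation label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total == 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Made-up annotations for 20 generated citations.
labels = ["correct"] * 14 + ["hallucinated"] * 4 + ["distorted"] * 2
rates = label_rates(labels)
low, high = wilson_interval(labels.count("hallucinated"), len(labels))
print(rates)
print(f"hallucinated: {rates['hallucinated']:.0%} (95% CI {low:.0%} to {high:.0%})")
```

With only 20 annotated citations the interval is wide, which is one reason rates from small evaluations are hard to compare.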
Related terms
| Term | Distinction |
|---|---|
| Hallucination | Hallucinated citations are a subtype of hallucination confined to bibliographic references. General hallucinations include any factual error in generated text, whether or not a citation is attached. |
| Confabulation | Confabulation refers to unconscious memory distortion in humans. Hallucinated citations are systematic model artifacts without subjective intent or memory involvement. |
| Citation distortion | Citation distortion involves real sources cited with incorrect attribution, quotes, or context. In the narrow sense, hallucinated citations reference sources that do not exist; the broader definition used in this article also covers distorted citations. |
| False attribution | False attribution is any incorrect claim about authorship or source. Hallucinated citations are false attributions specific to bibliographic references. |
Examples
- **GPT-3.5 legal domain study (2023)**: Researchers found that when asked to cite specific U.S. case law, the model generated citations to nonexistent court decisions with plausible-sounding case names and docket numbers. Verification against federal and state case databases confirmed these cases did not exist.
- **Biomedical literature evaluation**: A language model generated a citation to "Smith et al. (2019), *Journal of Molecular Medicine*, vol. 45, pp. 234–241" supporting a drug interaction claim. The cited journal volume did not exist, and the paper could not be found in PubMed or PubMed Central.
- **Wikipedia reference contamination**: Studies of language models trained partly on Wikipedia revision histories documented cases where models cited Wikipedia articles themselves as supporting sources, or cited nonexistent Wikipedia revision dates, creating circular or false documentation chains.
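Before a citation such as the biomedical example above can be looked up, it has to be split into fields. The following sketch assumes one specific citation layout (authors, year in parentheses, journal, volume, page range). The regular expression and the `parse_citation` helper are illustrative, not a general bibliographic parser. Parsing only prepares the lookup; it cannot show that a source exists.

```python
import re

# Matches: Authors (YYYY), *Journal*, vol. N, pp. A–B
# The layout is an assumption taken from the example citation.
CITATION_RE = re.compile(
    r"(?P<authors>.+?)\s*\((?P<year>\d{4})\),\s*"
    r"\*?(?P<journal>[^*,]+)\*?,\s*"
    r"vol\.\s*(?P<volume>\d+),\s*"
    r"pp\.\s*(?P<first_page>\d+)[–-](?P<last_page>\d+)"
)


def parse_citation(text):
    """Return a dict of citation fields, or None if the layout does not match."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields so they can be compared against index records.
    for key in ("year", "volume", "first_page", "last_page"):
        fields[key] = int(fields[key])
    fields["journal"] = fields["journal"].strip()
    return fields


example = "Smith et al. (2019), *Journal of Molecular Medicine*, vol. 45, pp. 234–241"
print(parse_citation(example))
```

The parsed volume and page range can then be checked against the journal's actual issue list, which is the kind of check that exposes a nonexistent volume.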