Llms.txt

From llmref.wiki
Llms.txt — A proposed site-root text file that gives AI systems a curated, machine-readable guide to a site's key content.

Overview

llms.txt is a proposed convention for a Markdown file placed at a website's root (/llms.txt) that gives large language models a concise, curated map of the site's most important content. Jeremy Howard proposed it in 2024 as a way to help AI systems find and use a site's primary information at inference time, where finite context windows limit how much of a site a model can read.[1]

llms.txt is an emerging proposal rather than a ratified standard; it has no RFC or W3C status, and its adoption and interpretation by AI vendors are not uniform.

How it works

The file is Markdown, readable by both humans and machines, with a defined structure: an H1 site title, a blockquote summary, optional notes, and H2 sections listing key URLs with short descriptions. A companion /llms-full.txt may contain expanded content. llms.txt expresses which content matters; it is distinct from access-control files, which express what may be crawled.
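The structure described above can be illustrated with a small Python sketch. The sample file is hypothetical (the site name, URLs, and descriptions are invented), and parse_llms_txt is an illustrative helper rather than part of the proposal or of any library; it assumes each link is written as a Markdown list item of the form "- [name](url): description".

```python
import re
from dataclasses import dataclass, field

# Hypothetical sample file; the site name, URLs, and descriptions are invented.
SAMPLE = """\
# Example Docs

> Reference documentation for the Example API.

Pages are written in Markdown and updated with each release.

## API

- [Authentication](https://example.com/docs/auth.md): How to obtain and use tokens
- [Endpoints](https://example.com/docs/endpoints.md): Full endpoint reference

## Guides

- [Quickstart](https://example.com/docs/quickstart.md)
"""

# Assumed link format: "- [name](url)" with an optional ": description" suffix.
LINK = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    notes: list = field(default_factory=list)
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Split an llms.txt-style document into title, summary, notes, and sections."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            # H2 heading: start a new section of links.
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            # First H1 heading: the site title.
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            # Blockquote before any section: the summary.
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        elif current is not None:
            # Inside a section: keep only lines that look like links.
            m = LINK.match(line)
            if m:
                doc.sections[current].append((m["name"], m["url"], m["desc"] or ""))
        else:
            # Free text before the first section: optional notes.
            doc.notes.append(line)
    return doc


doc = parse_llms_txt(SAMPLE)
```

Here doc.title holds the H1 text, doc.summary the blockquote, doc.notes the free-text paragraph, and doc.sections maps each H2 heading to its (name, URL, description) entries. A production parser would need to handle multi-line blockquotes, nested lists, and malformed links more carefully.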

Distinction from related terms

| File | Purpose | Governs |
| --- | --- | --- |
| llms.txt | Curated content guide for AI | Which content to prioritize |
| robots.txt | Crawler access control | Whether a path may be crawled |
| ai.txt / proposals | Usage/training permissions | Whether content may be used for AI training |
| XML sitemap | Exhaustive URL inventory | Discovery of all pages |

llms.txt is not an access-control mechanism: it does not block crawling or training (that is the role of robots.txt and usage directives), and it does not guarantee that any model will read or honor it.
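The separation of roles can be sketched in Python with the standard library's urllib.robotparser. The robots.txt rules, the URLs, the ExampleBot user agent, and the fetchable_priority_urls helper are all hypothetical. The sketch assumes a well-behaved crawler that treats llms.txt only as a list of priorities and consults robots.txt separately for permission; real crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for an invented site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""

# URLs that a hypothetical llms.txt lists as priority content.
LLMS_TXT_URLS = [
    "https://example.com/docs/quickstart.md",
    "https://example.com/drafts/roadmap.md",
]


def fetchable_priority_urls(robots_txt, urls, user_agent="ExampleBot"):
    """Return the llms.txt URLs that robots.txt also permits crawling."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    # Being listed in llms.txt grants nothing; robots.txt decides access.
    return [u for u in urls if rules.can_fetch(user_agent, u)]


allowed = fetchable_priority_urls(ROBOTS_TXT, LLMS_TXT_URLS)
```

With these inputs, allowed contains only the quickstart URL: the roadmap page is listed in llms.txt but still excluded by the Disallow rule.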

Examples

  • A documentation site publishes /llms.txt linking its API reference and key guides so AI assistants surface the canonical pages.
  • This site's own llms.txt indexes its primary reference pages.
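As a rough illustration of the documentation-site case, the following Python sketch renders an llms.txt body from a list of pages. The render_llms_txt function, the site name, and the URLs are invented for the example; the output follows the H1, blockquote, and H2 layout described under "How it works" and is not an official generator.

```python
def render_llms_txt(title, summary, sections):
    """Render a title, summary, and {heading: [(name, url, description)]} map."""
    lines = [f"# {title}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for name, url, description in links:
            entry = f"- [{name}]({url})"
            if description:
                # The description is optional; omit the colon when it is empty.
                entry += f": {description}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical documentation site.
text = render_llms_txt(
    "Example Docs",
    "Reference documentation for the Example API.",
    {
        "API": [
            ("Authentication", "https://example.com/docs/auth.md",
             "How to obtain and use tokens"),
        ],
        "Guides": [
            ("Quickstart", "https://example.com/docs/quickstart.md", ""),
        ],
    },
)
```

The resulting text could be served at /llms.txt. Whether any given AI system reads it remains outside the site's control.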

References

  1. Howard, J. (2024). "The /llms.txt file." https://llmstxt.org/