RFC 9309 (Robots Exclusion Protocol)

The IETF standard that formally specifies robots.txt parsing and behavior.

Definition

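Published in 2022. Codifies the long-standing de facto robots.txt format: User-agent grouping, Allow/Disallow rules, and longest-match precedence (the matching rule with the longest path wins; when an Allow and a Disallow rule are equally specific, Allow should be used). The protocol governs crawling, not indexing: a page reachable via an external link can still be indexed despite a Disallow rule. A noindex directive prevents indexing only while the page remains crawlable, because a crawler that is blocked from fetching the page never sees the directive.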
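
Longest-match precedence can be sketched in Python as follows. This is a minimal illustration, not a full parser: it assumes the rules for one User-agent group have already been extracted, and it uses plain prefix matching, ignoring the * and $ special characters that the RFC also defines. The is_allowed function and the (kind, prefix) rule tuples are hypothetical names chosen for this example.

```python
from typing import List, Tuple

# Hypothetical representation: (kind, path_prefix), kind is "allow" or "disallow".
Rule = Tuple[str, str]


def is_allowed(rules: List[Rule], path: str) -> bool:
    """Return True if `path` may be crawled under `rules`.

    Simplified: prefix matching only, no '*' or '$' handling.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        # An empty pattern matches nothing; skip non-matching rules.
        if not prefix or not path.startswith(prefix):
            continue
        n = len(prefix.encode("utf-8"))  # specificity is measured in octets
        # The longest match wins; on a tie, prefer the allow rule.
        if n > best_len or (n == best_len and kind == "allow"):
            best_len = n
            allowed = kind == "allow"
    return allowed


rules = [
    ("disallow", "/private/"),
    ("allow", "/private/press/"),
]

print(is_allowed(rules, "/private/press/2022.html"))  # longer allow rule wins
print(is_allowed(rules, "/private/notes.html"))       # only the disallow rule matches
print(is_allowed(rules, "/public/index.html"))        # no rule matches
```

Rule order does not matter in this sketch: only the length of the matching path decides the outcome, which is the behavior that distinguishes longest-match from first-match parsers.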
