Robots.txt


Robots.txt is a text file that webmasters create to instruct web robots (typically search engine crawlers) how to crawl the pages of their website.

In practice, a robots.txt file indicates whether certain user agents (crawling software such as search engine bots, not web browsers) may crawl parts of a website. These crawling instructions are specified by disallowing or allowing access for some (or all) user agents.
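For illustration, here is a minimal sketch of a robots.txt file; the paths and directory names are hypothetical, not part of any real site:

```
# Googlebot matches its own group below, so it may crawl
# everything except /tmp/.
User-agent: Googlebot
Disallow: /tmp/

# All other crawlers fall through to the wildcard group
# and are blocked from /private/.
User-agent: *
Disallow: /private/
```

A crawler reads the group whose User-agent line best matches its own name, then applies that group's Disallow (and Allow) rules before fetching a URL.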

The robots.txt file is part of the Robots Exclusion Protocol (REP), a group of web standards that governs how robots crawl the web, access and index content, and serve it to users. The REP also includes directives like meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as "follow" or "nofollow").
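A well-behaved crawler checks these rules before fetching a page. As a sketch, Python's standard library ships a REP parser in urllib.robotparser; the site URL and user-agent name below are hypothetical placeholders:

```python
from urllib.robotparser import RobotFileParser

# Download and parse the site's robots.txt file.
rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Ask whether the crawler "MyBot" is allowed to fetch a given URL
# under the rules that robots.txt declares for it.
if rp.can_fetch("MyBot", "https://example.com/private/page.html"):
    print("Allowed to crawl")
else:
    print("Disallowed by robots.txt")
```

Note that compliance is voluntary: robots.txt is an instruction to cooperating crawlers, not an access-control mechanism.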
