public tool

Robots.txt Analyzer

Inspect a site's published crawl preferences before you collect pages. This tool reads robots.txt and reports directives; it does not attempt to bypass them.

preview result

https://example.com

200 OK

robots URL: https://example.com/robots.txt

user-agent groups: 2 groups

allow rules: 1 rule

disallow rules: 3 rules

crawl-delay: 5

sitemaps: 1 URL

groups: 2 sections

Group 1

4 directives

*

crawl-delay: 5, 1 allow, 2 disallow

allow

/public

disallow

/admin
/checkout

Group 2

2 directives

ExampleBot

crawl-delay: none, 0 allow, 1 disallow

allow

none declared

disallow

search notes

Read crawl policy before planning a monitor

The robots.txt analyzer fetches a public robots file, groups directives by user-agent, and highlights sitemaps, crawl-delay values, and notable warnings for compliance-aware discovery.
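As a rough illustration of the grouping step described above, here is a minimal Python sketch. It is not the analyzer's actual implementation: the `Group` class, the `parse_robots` function, and the `SAMPLE` file are all hypothetical, and the ExampleBot path and sitemap URL in the sample are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Invented sample file; the ExampleBot path and the sitemap URL are placeholders.
SAMPLE = """\
User-agent: *
Crawl-delay: 5
Allow: /public
Disallow: /admin
Disallow: /checkout

User-agent: ExampleBot
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""


@dataclass
class Group:
    """One user-agent group and the directives declared under it."""
    user_agents: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)
    disallow: list[str] = field(default_factory=list)
    crawl_delay: float | None = None


def parse_robots(text: str) -> tuple[list[Group], list[str]]:
    """Group directives by user-agent and collect sitemap URLs."""
    groups: list[Group] = []
    sitemaps: list[str] = []
    current: Group | None = None
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "sitemap":
            sitemaps.append(value)  # sitemaps are not tied to a group
            continue
        if key == "user-agent":
            # Consecutive user-agent lines share one group.
            if current is None or not last_was_agent:
                current = Group()
                groups.append(current)
            current.user_agents.append(value)
            last_was_agent = True
            continue
        last_was_agent = False
        if current is None:
            continue  # rule appears before any user-agent line
        if key == "allow":
            current.allow.append(value)
        elif key == "disallow":
            current.disallow.append(value)
        elif key == "crawl-delay":
            try:
                current.crawl_delay = float(value)
            except ValueError:
                pass  # non-numeric delay: leave unset for human review


    return groups, sitemaps


groups, sitemaps = parse_robots(SAMPLE)
for group in groups:
    print(group.user_agents, len(group.allow), len(group.disallow), group.crawl_delay)
```

Stripping everything after `#` is a simplification; a path or URL containing a literal `#` would be truncated, which is the kind of edge case a real parser would flag.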

What it checks

Robots URL, status code, final URL, and content type
User-agent groups with allow and disallow counts
Crawl-delay values and sitemap directives
Warnings for large files, unusual responses, and parser edge cases
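The checks listed above could be gathered along these lines. This is a sketch under stated assumptions, not the tool's code: `robots_url`, `build_warnings`, `fetch_robots`, the `MAX_BYTES` threshold, and the User-Agent string are hypothetical names and values chosen for illustration, using only Python's standard `urllib`.

```python
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen

# Assumed threshold for the "large file" warning. RFC 9309 asks crawlers to
# parse at least 500 KiB, so this sketch warns above that size.
MAX_BYTES = 500 * 1024


def robots_url(site_url: str) -> str:
    """Derive the robots.txt URL from any absolute URL on the site."""
    parts = urlsplit(site_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {site_url!r}")
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def build_warnings(status: int, content_type: str, size: int) -> list[str]:
    """Flag unusual responses; the exact rules here are assumptions."""
    warnings = []
    if status != 200:
        warnings.append(f"unusual status: {status}")
    if not content_type.lower().startswith("text/plain"):
        warnings.append(f"unexpected content type: {content_type or 'missing'}")
    if size > MAX_BYTES:
        warnings.append(f"large file: {size} bytes")
    return warnings


def fetch_robots(site_url: str, timeout: float = 10.0) -> dict:
    """Fetch robots.txt and report URL, status, final URL, and content type."""
    url = robots_url(site_url)
    request = Request(url, headers={"User-Agent": "robots-check-sketch/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            status = response.status
            final_url = response.url  # differs from url after a redirect
            content_type = response.headers.get("Content-Type", "")
            body = response.read(MAX_BYTES + 1)  # read one byte past the cap
    except HTTPError as error:
        # 4xx/5xx responses raise; record the status and skip the body.
        status, final_url, body = error.code, url, b""
        content_type = error.headers.get("Content-Type", "") if error.headers else ""
    return {
        "robots_url": url,
        "status": status,
        "final_url": final_url,
        "content_type": content_type,
        "warnings": build_warnings(status, content_type, len(body)),
        "body": body,
    }
```

Calling `fetch_robots("https://example.com")` performs a real network request, so it is left out of the block. The two helper functions can be exercised offline.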

Use cases

Check crawl policy before monitoring catalog or listing pages
Find sitemap URLs for discovery without guessing paths
Compare allow and disallow rules across competitor sites
Document compliance considerations for a data extraction project

Limitations

Robots.txt is a public policy signal, not an access-control system.
Rules can differ by user-agent and may require human review.
This analyzer does not grant permission or override site terms.

faq

Common questions

What does robots.txt tell a scraper?

Robots.txt communicates crawl preferences such as allowed paths, disallowed paths, crawl-delay hints, and sitemap locations for different user agents.

Does robots.txt block access technically?

No. Robots.txt is a policy file. Responsible crawlers use it as an input when deciding what to crawl and how often.
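One way a crawler might use the file as an input is Python's standard `urllib.robotparser`. The sketch below is illustrative only: the `SAMPLE` file, the `plan` helper, the `MyMonitor` agent name, and the `/search` path are hypothetical. It also shows that answers can differ by user-agent.

```python
from urllib.robotparser import RobotFileParser

# Invented sample; the ExampleBot path is a placeholder.
SAMPLE = """\
User-agent: *
Crawl-delay: 5
Allow: /public
Disallow: /admin
Disallow: /checkout

User-agent: ExampleBot
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())  # parse text directly, no network fetch


def plan(user_agent: str, url: str) -> tuple:
    """Return (may_fetch, crawl_delay) for one agent and URL."""
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)


# An agent with no group of its own falls back to the "*" rules.
print(plan("MyMonitor", "https://example.com/public/page"))
print(plan("MyMonitor", "https://example.com/admin/users"))

# ExampleBot has its own group, so the "*" rules and delay do not apply to it.
print(plan("ExampleBot", "https://example.com/admin/users"))
print(plan("ExampleBot", "https://example.com/search"))
```

Python's parser applies the first matching rule in file order, which can differ from the longest-match behavior described in RFC 9309. A `True` result reflects only the published preference, not permission under the site's terms.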

Why are sitemap directives useful?

Sitemap directives reveal URL inventories that can help plan discovery, estimate coverage, and reduce unnecessary crawling.
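To make the inventory idea concrete, the following sketch reads sitemap URLs from a robots file and lists the page URLs in a sitemap document. The function names and both sample documents are hypothetical; the XML namespace is the one defined by the sitemaps.org protocol. Sitemap index files and compressed sitemaps are not handled here.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Invented samples for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin
Sitemap: https://example.com/sitemap.xml
"""

SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/public/a</loc></url>
  <url><loc>https://example.com/public/b</loc></url>
</urlset>
"""


def sitemap_urls_from_robots(robots_text: str) -> list[str]:
    """Return the Sitemap directives declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when empty


def page_urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> values of a plain (non-index) sitemap."""
    root = ET.fromstring(xml_text.strip())
    locs = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


print(sitemap_urls_from_robots(SAMPLE_ROBOTS))
print(len(page_urls_from_sitemap(SAMPLE_SITEMAP)), "URLs listed")
```

Counting the listed URLs gives a first estimate of coverage before any page is requested. `RobotFileParser.site_maps()` requires Python 3.8 or later.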