robots.txt Generator

Build your robots.txt file with a visual form. Add user-agent rules, allow and disallow paths, and use presets like Block AI Bots. Copy or download the result. Everything runs locally in your browser.

What a robots.txt file does (and doesn't do)

A robots.txt file sits at the root of your domain (https://example.com/robots.txt) and tells crawlers which paths they may fetch. Google, Bing, and most engines check it before crawling. A missing file won't cause a penalty, but a careless one can accidentally block your whole site — surprisingly common right after a staging site goes live.

Crucial distinction: Disallow blocks crawling, not indexing. If Google has already indexed a URL and you later Disallow it, the page won't drop from results on its own. To remove a page you need a noindex robots meta tag or an X-Robots-Tag: noindex HTTP header, and Google must be able to crawl the page to see it. Blocking crawling of a page you want removed is a classic mistake.
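As a rough illustration of the header route, here is a minimal WSGI sketch that serves a page with X-Robots-Tag: noindex. The path /old-landing-page and the names NOINDEX_PATHS, app, and response_headers are hypothetical, and a real site would normally set this header in its web server or framework configuration instead.

```python
# Hypothetical set of paths that should drop out of search results.
NOINDEX_PATHS = {"/old-landing-page"}


def app(environ, start_response):
    """Tiny WSGI app: serve every page, but mark some as noindex."""
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if environ.get("PATH_INFO") in NOINDEX_PATHS:
        # A crawler only sees this header if it may fetch the page,
        # so the path must NOT be disallowed in robots.txt.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<!doctype html><title>Example</title>"]


def response_headers(path):
    """Call the app directly (no server needed) and return its headers."""
    captured = {}

    def start_response(status, headers):
        captured.update(headers)

    app({"PATH_INFO": path}, start_response)
    return captured


print(response_headers("/old-landing-page"))
```

The point of the sketch is the pairing: the page stays crawlable, and the noindex signal travels with the response.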

Once the file is generated, declare your sitemap with a Sitemap: directive (you can build one with the Sitemap Generator). Then run a full check with the SEO Analyzer, which verifies your robots.txt is reachable and well-formed. For the wider context, see the complete on-page SEO checklist.

How to use this tool

Choose a preset (Allow All, Block AI Bots, Block Everything) or start from scratch.
Add user-agent rules: * for all bots, or named crawlers like Googlebot or GPTBot.
Add Disallow paths for anything you don't want crawled (e.g. /admin/, /staging/).
Add an Allow rule if you need to open a specific path inside a disallowed directory.
Add your Sitemap URL at the bottom, then copy or download and upload to your domain root.
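
The steps above can be sketched in a few lines of Python. This is not the generator's actual code, only an illustration of the output it is meant to produce; build_robots_txt and the /admin/help/ path are made up for the example.

```python
def build_robots_txt(groups, sitemap=None):
    """Render robots.txt text.

    groups: list of (user_agents, allow_paths, disallow_paths) tuples.
    sitemap: optional absolute sitemap URL, written at the bottom.
    """
    lines = []
    for user_agents, allow_paths, disallow_paths in groups:
        lines += [f"User-agent: {agent}" for agent in user_agents]
        # Allow lines go first so first-match parsers (such as Python's
        # urllib.robotparser) also honor the exception.
        lines += [f"Allow: {path}" for path in allow_paths]
        lines += [f"Disallow: {path}" for path in disallow_paths]
        lines.append("")  # blank line between groups
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines).rstrip("\n") + "\n"


example = build_robots_txt(
    [(["*"], ["/admin/help/"], ["/admin/", "/staging/"])],
    sitemap="https://example.com/sitemap.xml",
)
print(example)
```

Save the printed text as robots.txt and upload it to your domain root.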

Common rules and when to use them

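Block AI training crawlers. Add User-agent: GPTBot with Disallow: / (and the same for CCBot, anthropic-ai, etc.) to tell those crawlers not to fetch your content for model training. Compliance is voluntary, so this only works for bots that honor robots.txt. The Block AI Bots preset handles it.
Block admin areas. Disallow: /wp-admin/ or /admin/ stops crawlers from fetching login and dashboard pages. It does not guarantee those URLs stay out of the index, and it is not access control, so keep them behind authentication.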
Block internal search. Result pages like /search?q= create thin duplicate content — disallow the search path.
Reference your sitemap. Add Sitemap: https://example.com/sitemap.xml so every crawler finds it, not just the ones you submit to manually.
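
To sanity-check rules like these before uploading, one option is Python's standard urllib.robotparser. The sketch below parses a sample file that combines the rules above; the sample paths and the allowed helper are illustrative. Python's parser is simpler than Google's, so treat it as a rough check only.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """True if `agent` may fetch `path` on the example domain."""
    return parser.can_fetch(agent, "https://example.com" + path)


for agent, path in [
    ("GPTBot", "/blog/"),
    ("Googlebot", "/blog/"),
    ("Googlebot", "/admin/login"),
    ("Googlebot", "/search?q=shoes"),
]:
    print(agent, path, allowed(agent, path))

# site_maps() is available in Python 3.8 and later.
print(parser.site_maps())
```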

Frequently asked questions

Where do I upload my robots.txt file?

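It must live at the root of your domain: https://yourdomain.com/robots.txt. It cannot be in a subdirectory. On most servers that means the public web root (often public_html or www, or a build output folder such as dist). For sites deployed to Cloudflare Pages, Vercel, or Netlify, the location depends on your framework: place the file in its static assets folder (commonly public or static) so it is copied to the root of the build output and served from the root automatically.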

Does Disallow remove a page from Google's index?

No. Disallow blocks crawling — it stops Google fetching the page. But an already-indexed URL (or one found via a sitemap or backlink) can stay in the index. To remove a page from results, use a noindex meta tag or HTTP header, and make sure the page is NOT blocked from crawling so Google can read that tag.

How do I block AI crawlers in robots.txt?

Add a separate User-agent block for each AI bot you want to block — GPTBot (OpenAI), CCBot (Common Crawl), anthropic-ai, Google-Extended, and PerplexityBot are the common ones — with Disallow: / under each. The generator’s "Block AI Bots" preset writes these rules for you.
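
For reference, the rules described here can be produced with a short loop. This is only a sketch of the kind of output such a preset writes, not the tool's code; block_bots is a made-up name, and the bot list is the one named above.

```python
# User-agent tokens named in the answer above; extend the list as needed.
AI_BOTS = ["GPTBot", "CCBot", "anthropic-ai", "Google-Extended", "PerplexityBot"]


def block_bots(bots):
    """Return one 'User-agent / Disallow: /' block per bot."""
    blocks = [f"User-agent: {bot}\nDisallow: /" for bot in bots]
    return "\n\n".join(blocks) + "\n"


print(block_bots(AI_BOTS))
```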

Can I have more than one User-agent block?

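Yes. You can stack multiple User-agent lines before a shared set of rules, or write separate blocks per bot. A crawler follows only the most specific block that matches its name, so Googlebot, Bingbot, and GPTBot each obey their own block and ignore the * rules. Use * as a catch-all for bots not explicitly listed, and repeat any shared rules inside each named block.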
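
A small experiment with Python's standard urllib.robotparser shows how named blocks and the catch-all interact. The /drafts/ and /private/ paths and the bot name SomeOtherBot are invented for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Bingbot
Disallow: /drafts/

User-agent: *
Disallow: /drafts/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://example.com"

# Googlebot and Bingbot share the first block and never read the * block,
# so /private/ stays open to them unless it is repeated in their block.
googlebot_private = parser.can_fetch("Googlebot", BASE + "/private/page")
bingbot_drafts = parser.can_fetch("Bingbot", BASE + "/drafts/post")

# A bot with no named block falls back to the * rules.
other_private = parser.can_fetch("SomeOtherBot", BASE + "/private/page")

print(googlebot_private, bingbot_drafts, other_private)
```

If the named bots should also stay out of /private/, the Disallow line has to appear in their block as well.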

How do I check that my robots.txt is working?

Google Search Console has a robots.txt report under Settings. You can also just visit https://yourdomain.com/robots.txt in a browser. To verify a specific page is crawlable after editing, use the URL Inspection tool in Search Console.