robot.guard
smart robots.txt that pays for itself
robot.guard turns the most overlooked file on your site into a control panel. Instead of hand-editing robots.txt and hoping you got the syntax right, you tick the legitimate crawlers you want to keep — Googlebot, Bingbot, the Internet Archive — and switch off the AI scrapers that crawl hardest and give back least. Every choice writes a valid, standards-compliant directive for you.
Underneath the friendly toggles is a real editor: curated allow-lists, a maintained blocklist of known AI user-agents like GPTBot, CCBot, and ClaudeBot, plus your own custom allow/disallow rules. You preview the exact file as you go and download it ready to drop at your site root — so the bots that cost you money stop crawling, and the ones that send you traffic keep coming.
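To make the toggle-to-directive idea concrete, here is a minimal Python sketch of how an allow-list, a blocklist, and custom rules could be rendered into a robots.txt body. It is an illustration only, not robot.guard's actual implementation: the function name `build_robots_txt` and the `(user_agent, directive, path)` tuple shape are assumptions made for this example.

```python
def build_robots_txt(allowed_bots, blocked_bots, custom_rules=()):
    """Render a robots.txt body (hypothetical helper, not robot.guard's code).

    allowed_bots / blocked_bots: iterables of user-agent tokens.
    custom_rules: iterable of (user_agent, directive, path) tuples,
    where directive is "Allow" or "Disallow".
    """
    groups = []

    for bot in allowed_bots:
        # An empty Disallow value means nothing is off limits for this bot.
        groups.append(f"User-agent: {bot}\nDisallow:")

    for bot in blocked_bots:
        # "Disallow: /" asks the bot to stay out of the whole site.
        groups.append(f"User-agent: {bot}\nDisallow: /")

    # Collect custom rules per user-agent so each agent gets one group.
    by_agent = {}
    for agent, directive, path in custom_rules:
        if directive not in ("Allow", "Disallow"):
            raise ValueError(f"unsupported directive: {directive!r}")
        by_agent.setdefault(agent, []).append(f"{directive}: {path}")

    for agent, lines in by_agent.items():
        groups.append("\n".join([f"User-agent: {agent}", *lines]))

    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(
        allowed_bots=["Googlebot", "Bingbot"],
        blocked_bots=["GPTBot", "CCBot", "ClaudeBot"],
        custom_rules=[("*", "Disallow", "/admin/")],
    ))
```

The sketch does not merge a custom rule into an existing allow or block group for the same user-agent; a real generator would probably want to.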
how it works
- 01
whitelist the good bots
Tick the crawlers you want — search, social, archives — from a curated list.
- 02
block the scrapers
Switch off known AI scrapers, or add your own user-agent and path rules.
- 03
generate & download
Preview the exact robots.txt, then download it to drop at your site root.
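Before dropping a downloaded file at your site root, you can sanity-check it locally. The sketch below uses Python's standard-library `urllib.robotparser` to confirm that a search crawler is allowed and AI scrapers are not. The sample file contents, the `can_fetch` wrapper, and the `example.com` URL are assumptions for illustration, and real crawlers may match rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Sample file for illustration; substitute the robots.txt you downloaded.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""


def can_fetch(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/"
    for bot in ("Googlebot", "GPTBot", "CCBot"):
        verdict = "allowed" if can_fetch(ROBOTS_TXT, bot, page) else "blocked"
        print(f"{bot}: {verdict}")
```

A check like this only tells you what a compliant crawler should do with the file; it says nothing about bots that ignore robots.txt.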
a look inside
a few of the screens you'll actually use.
- whitelist the good bots
- block ai scrapers
- generate & download
curated ai scraper blocklist
kept current as new crawlers appear — toggle the ones you want shut out.
robot.guard guides
Ways to use robot.guard, and how it compares.
- use case
robot.guard: one place to whitelist the bots you want and block the ones you don't
A deep look at robot.guard, the robots.txt manager that turns a fragile text file into a control panel. Whitelist Googlebot and the crawlers that send you traffic, block AI scrapers like GPTBot and CCBot, preview the exact file, and download it in a click.
- use case
What is robots.txt, and why it matters more than ever
robots.txt is a plain-text file at your site root that tells crawlers which parts of your site they may visit. Here is what it does, what it can't do, and why the rise of AI scrapers makes it worth a second look.
- how to
How to block specific AI bots from scraping your website
AI crawlers like GPTBot, ClaudeBot, CCBot and Google-Extended harvest your content for training. Here's how to block them in robots.txt by user-agent, and how robot.guard keeps the list current for you.
- use case
The hidden cost of unwanted bot traffic, and how AI scrapers inflate it
Bots are roughly half of all web traffic, and AI scrapers crawl harder than search engines while sending nothing back. Here's how that translates into real bandwidth, compute, and hosting costs, and how to cut it.
- how to
robots.txt for SEO: how to whitelist Googlebot without locking out the rest
A robots.txt mistake can quietly tank your SEO. Here's how to make sure Googlebot, Bingbot and essential crawlers can reach what they should, while still blocking the bots that only cost you.
- comparison
robots.txt vs firewall: choosing the right bot protection
robots.txt politely asks compliant bots to stay out; a firewall forcibly blocks any request. Here's how they differ, when each one is the right tool, and why most sites need both.
- how to
A developer's guide to robots.txt rules that don't bite you later
User-agent matching, Allow vs Disallow precedence, wildcards, and the gotchas that quietly break crawling. A practical robots.txt reference for developers, plus how to keep the file under version control.
- how to
How to generate an intelligent robots.txt for the modern web
A modern robots.txt has to do two jobs: welcome the search and social crawlers you want, and turn away the AI scrapers you don't. Here's how to generate one that does both, without hand-writing a line.
- how to
How to protect your content from AI training with robots.txt
Most AI training crawlers honour an opt-out in robots.txt. Here's how to keep your writing, images and data out of training datasets by blocking GPTBot, CCBot, Google-Extended and the rest, while staying in search.
- use case
Small business website security: robots.txt strategies that actually help
You don't need an enterprise security team to control who crawls your site. Here are the practical robots.txt basics for a small business: what to allow, what to block, and the mistakes to avoid.
- how to
How to block GPTBot in robots.txt
Block GPTBot, OpenAI's training crawler, with two lines in robots.txt. Learn the exact rule, how GPTBot differs from OAI-SearchBot, and how to keep search intact.
- how to
How to block ClaudeBot and anthropic-ai
Block Anthropic's crawlers, ClaudeBot and anthropic-ai, in robots.txt. Get the exact rules, both user-agents, and confirmation it won't touch your search ranking.
- how to
How to block CCBot (Common Crawl)
Block CCBot, the Common Crawl bot, with two lines in robots.txt. Learn why blocking one crawler cuts off the dataset that trains many downstream AI models.
- use case
What is Google-Extended (and should you block it)?
Google-Extended is Google's separate AI-training crawler token. Blocking it opts you out of Gemini and Vertex AI training while keeping full Google Search ranking.
- use case
Do AI bots actually respect robots.txt?
An honest look at whether AI crawlers obey robots.txt. The major identifiable ones honour it; some ignore it. It works for the compliant majority, not as a wall.
- use case
The AI crawler user-agents to know in 2026
A reference rundown of the AI crawler user-agents in 2026, grouped by operator, with what each one is for and why a maintained block list beats a static copy-paste.
- how to
How to edit robots.txt in WordPress
WordPress serves a virtual robots.txt with no physical file. Learn how to override it, edit rules, and add a robot.guard set that allows search and blocks AI bots.
- how to
Robots.txt on Shopify: what you can and can't change
Shopify auto-generates robots.txt, but the robots.txt.liquid template now lets you add rules, including AI-scraper blocks. Learn what you can change and the limits.
- comparison
robots.txt vs noindex: which one keeps a page out of Google?
robots.txt controls crawling, noindex controls indexing. Learn why blocking a page in robots.txt can leave a bare URL in Google, and which to use when.
- comparison
robots.txt vs llms.txt: do you need both?
robots.txt blocks the crawlers you don't want; llms.txt helps the AI engines you do want understand your site. They're complementary; here's how they differ.