How to Block GPTBot in robots.txt (OpenAI's Crawler)

The exact rule, and what it does

The full block is two lines: 'User-agent: GPTBot' on the first line and 'Disallow: /' on the second. The user-agent line names OpenAI's training crawler by its user-agent token, and the disallow line with a single slash means 'all paths'. Place this block in a plain text file named robots.txt at the root of your domain, for example example.com/robots.txt, because that is the location crawlers check before fetching other pages.
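
As a rough sketch, the two-line rule can be checked locally with Python's standard-library urllib.robotparser. The page URLs and the Googlebot comparison are illustrative assumptions, and Python's parser only approximates how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The two-line block, exactly as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, used only to illustrate the effect of "Disallow: /".
gptbot_home = parser.can_fetch("GPTBot", "https://example.com/")
gptbot_page = parser.can_fetch("GPTBot", "https://example.com/pricing")

# No group names Googlebot and there is no "*" group, so it is unrestricted.
googlebot_home = parser.can_fetch("Googlebot", "https://example.com/")

print(gptbot_home, gptbot_page, googlebot_home)
```

Because the file contains no 'User-agent: *' group, agents other than GPTBot fall through to the default, which is to allow access.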

robots.txt is a request, not a wall. Compliant crawlers like GPTBot read it and obey it, and OpenAI states that GPTBot respects these directives. But the file does not physically stop anything, so for bots that ignore robots.txt entirely you still want a firewall or rate limiting in front of your site. Think of robots.txt as the polite, well-behaved layer that handles the crawlers that play by the rules.

GPTBot vs OAI-SearchBot vs Googlebot

OpenAI runs more than one crawler, and they do different jobs. GPTBot collects content for model training. OAI-SearchBot powers ChatGPT's search feature, surfacing live results and citations. They are separate user-agents, so blocking GPTBot does not necessarily block OAI-SearchBot. If your goal is to stay out of training but remain quotable in ChatGPT search, block GPTBot and leave OAI-SearchBot allowed. If you want neither, block both by name.
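
The "training out, search in" setup might look like the sketch below, again checked with Python's urllib.robotparser. The article URL is a made-up example. The explicit Allow group for OAI-SearchBot is an assumption about how you might document intent; an agent with no matching group is allowed by default, so the group mainly matters if the file also has a restrictive 'User-agent: *' group.

```python
from urllib.robotparser import RobotFileParser

# Block the training crawler, keep the search crawler allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/articles/launch"  # hypothetical page
results = {
    agent: parser.can_fetch(agent, url)
    for agent in ("GPTBot", "OAI-SearchBot")
}
print(results)
```

To opt out of both, change the second group's 'Allow: /' to 'Disallow: /'.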

Googlebot is separate again. It is Google's search crawler, and your Google ranking depends on it being able to crawl your pages. Training-related user-agent tokens like GPTBot and Google-Extended are distinct from search crawlers like Googlebot and Bingbot, so you can block AI training agents by name without changing what those search crawlers are allowed to fetch. robot·guard makes this explicit: it whitelists the legitimate search and archive crawlers while blocking the training agents, so you never accidentally hide yourself from Google.

Blocking only part of your site

You do not have to block everything. To keep GPTBot out of a specific area, point the disallow at a path instead of the root. For example, 'User-agent: GPTBot' followed by 'Disallow: /blog/' lets GPTBot crawl the rest of your site but keeps it out of /blog/. You can stack multiple Disallow lines under the same user-agent to cover several sections.
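
A partial block with stacked Disallow lines could be sketched as follows. The /members/ section and the sample paths are hypothetical, and the check uses Python's urllib.robotparser, which matches rules by path prefix.

```python
from urllib.robotparser import RobotFileParser

# Keep GPTBot out of two sections and leave the rest of the site crawlable.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /blog/
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def gptbot_allowed(path):
    """Return True if GPTBot may fetch the given path on example.com."""
    return parser.can_fetch("GPTBot", "https://example.com" + path)


blog_post = gptbot_allowed("/blog/some-post")        # under /blog/
members_file = gptbot_allowed("/members/report")     # under /members/
docs_page = gptbot_allowed("/docs/getting-started")  # not listed
blog_no_slash = gptbot_allowed("/blog")              # prefix "/blog/" does not match

print(blog_post, members_file, docs_page, blog_no_slash)
```

Note the last case: because matching is by prefix, 'Disallow: /blog/' does not cover the bare path '/blog'. If that URL serves content you want held back, 'Disallow: /blog' is the broader rule, at the cost of also matching paths such as '/blogroll'.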

This is useful when you want marketing or documentation pages to stay machine-readable but want premium, original, or paywalled content held back from training. robot·guard lets you add these custom allow and disallow path rules on top of its curated bot list, then preview the exact file live and download it for your site root, so you can see precisely what GPTBot will and will not be told before you ship it.

How it works

01. Open or create robots.txt
Create a plain text file named robots.txt, or open your existing one. It must live at your domain root so it resolves at example.com/robots.txt. In robot·guard you start in the free editor with no file needed.

02. Add the GPTBot block
Add two lines: 'User-agent: GPTBot' then 'Disallow: /'. To block only part of the site, replace the slash with a path like '/blog/'. In robot·guard, toggle GPTBot in the curated AI crawler list to insert this for you.

03. Add related OpenAI and AI agents
If you also want out of ChatGPT search and other AI tools, add OAI-SearchBot, ClaudeBot, CCBot, and others. robot·guard keeps GPTBot and the related agents in a maintained list so you do not have to track new user-agents by hand.

04. Preview and deploy
Preview the exact file, confirm Googlebot and Bingbot are still allowed, then download robots.txt and upload it to your site root. Re-test by visiting example.com/robots.txt in a browser.
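
The "preview and confirm" step can also be approximated with a small script before you upload anything. This is a minimal sketch, not how robot·guard works internally: build_robots_txt and check_policy are hypothetical helper names, the agent lists simply reuse the names from the steps above, and the check relies on Python's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Agents named in the steps above; adjust both lists to your own policy.
BLOCKED_AGENTS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "CCBot"]
ALLOWED_AGENTS = ["Googlebot", "Bingbot"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text with one full-site Disallow group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    return "\n".join(groups)


def check_policy(robots_txt, blocked_agents, allowed_agents,
                 url="https://example.com/"):
    """Return a list of problems; an empty list means the file matches the policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    problems = []
    for agent in blocked_agents:
        if parser.can_fetch(agent, url):
            problems.append(f"{agent} is still allowed")
    for agent in allowed_agents:
        if not parser.can_fetch(agent, url):
            problems.append(f"{agent} is blocked")
    return problems


robots_txt = build_robots_txt(BLOCKED_AGENTS)
problems = check_policy(robots_txt, BLOCKED_AGENTS, ALLOWED_AGENTS)
print(robots_txt)
print(problems or "policy OK")
```

To run the same check against a deployed file, RobotFileParser can fetch it directly: call set_url() with the robots.txt address, then read(), and query can_fetch() as above.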

Frequently asked

Does blocking GPTBot hurt my Google ranking?

No. GPTBot is OpenAI's training crawler and is entirely separate from Googlebot, which handles search. Blocking GPTBot has zero effect on how Google indexes or ranks your pages.

Is GPTBot the same as the bot behind ChatGPT search?

No. ChatGPT search uses OAI-SearchBot, a different user-agent. Blocking GPTBot stops training collection but does not automatically block OAI-SearchBot. Block both by name if you want neither.

Will OpenAI actually obey the rule?

OpenAI publicly documents that GPTBot respects robots.txt directives, so a compliant crawl will honour it. Because robots.txt is a request and not enforcement, pair it with a firewall for bots that ignore the standard.

Can I block GPTBot from just one folder?

Yes. Under 'User-agent: GPTBot', use a path like 'Disallow: /blog/' instead of '/'. You can add several Disallow lines to cover multiple sections while leaving the rest crawlable.

Last updated June 9, 2026
