One Company, Three Bots: What You Need to Know About Anthropic's and OpenAI's Separate Crawlers

Anthropic operates three distinct bots, and OpenAI does the same, each serving a different function. If your robots.txt file blocks or allows only one of them, you can cut yourself off from an entire capability without realizing it. For example, you could permit ChatGPT's training bot while blocking its search bot, which means you won't appear in ChatGPT's live answers even though you're technically "open." This post breaks down what each of the six bots does and how to control them deliberately.

This isn't just a technical footnote. The six bots touch your customers at different points: one determines whether ChatGPT even knows you exist; another determines whether it cites you as a source in a live response. Before you adjust a single robots.txt setting, it's worth understanding exactly what you gain with each permission — and what you lose with each block.

Why isn't it enough to know just the "main bot"?

Many people think of AI bots as though each company runs a single, monolithic crawler across the internet. In reality, both Anthropic and OpenAI operate multiple bots that are functionally separate — each with its own name and User-agent string, making them individually controllable in your robots.txt file.

Confusing them can have real consequences. Someone who knows only the "ClaudeBot" name and blocks it, thinking "I don't want Anthropic mapping my site," is actually stopping only training data collection, while Claude's search bot, Claude-SearchBot, keeps coming through. The inverse happens too: someone might allow all Anthropic bots but treat "GPTBot" as shorthand for "all OpenAI bots," not realizing that ChatGPT's search and user-context bots have entirely different names. The impact stays invisible until someone measures it.

The six bots each have their own logic, and the distinction matters because it mirrors how model training is separated from user experience:

  • Training bot: it comes to add your page's content to the next model's training material. The effect takes months or even years to appear.
  • Search/response bot: it comes to cite your page as a source in real-time answers. The effect is immediate: today's question, today's answer.
  • User-context bot: it handles link sharing inside the Claude or ChatGPT app. When a user pastes a URL into the chat, this bot reads the page on the user's behalf.

These three functions represent three different values for you — and three separate decisions, not one.
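To keep the six names straight, it can help to write the mapping down as data. The following is a minimal Python sketch that uses the User-agent names covered in the sections that follow; the Crawler class, the function labels ("training", "search", "user"), and the by_function helper are my own illustrative names, not anything published by Anthropic or OpenAI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crawler:
    user_agent: str  # token used in robots.txt
    company: str
    function: str    # illustrative label: "training", "search", or "user"


# The six bots discussed in this post, grouped by company.
CRAWLERS = [
    Crawler("GPTBot", "OpenAI", "training"),
    Crawler("OAI-SearchBot", "OpenAI", "search"),
    Crawler("ChatGPT-User", "OpenAI", "user"),
    Crawler("ClaudeBot", "Anthropic", "training"),
    Crawler("Claude-SearchBot", "Anthropic", "search"),
    Crawler("Claude-User", "Anthropic", "user"),
]


def by_function(function: str) -> list[str]:
    """Return the User-agent names that serve the given function."""
    return [c.user_agent for c in CRAWLERS if c.function == function]


print(by_function("training"))  # the bots to block if you oppose training
print(by_function("search"))    # the bots that drive live citations
```

Each function label corresponds to one decision, so the list for a label is exactly the set of bots that decision affects.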

Anthropic's Three Bots: ClaudeBot, Claude-User, Claude-SearchBot

According to Anthropic's own documentation, they operate these three bots:

ClaudeBot — collects training data for Claude models. When ClaudeBot visits your page, the content it finds may become part of future Claude models' training material. The effect isn't immediate: training cycles take months. If you block ClaudeBot, your content won't influence Claude's future "knowledge" about you — but this block doesn't affect the other two bots.

Claude-User — handles user link sharing. When someone pastes a link into Claude and says "read this page," Claude-User goes and reads it on the user's behalf, in real-time. This is technically a proxy request: not Claude's servers acting on their own, but rather responding to the user's request. Blocking it means links pasted into Claude won't be readable — the user won't get a summary of your page's content, even if they want to show it to someone else.

Claude-SearchBot — performs live search and response citation. This is the bot that comes so Claude can cite your page as a source in real-time, when a user asks something your content answers. Of the three, this one directly impacts buyer visibility: if this bot can access your site and your page offers citable content, you can appear in Claude's live responses. Block it, and that opportunity disappears — regardless of whether ClaudeBot is allowed.

The most common Anthropic mistake: someone blocks ClaudeBot (reasonably, if they don't want their content in training data) and assumes that one rule settles Claude's access as a whole. It doesn't: training and live citation are handled by two different bots. Blocking ClaudeBot alone leaves Claude-SearchBot's access untouched, so Claude can still cite the page. A broad User-agent: * block with Disallow: /, on the other hand, locks out all three at once, and with it any chance of being cited.
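The difference between blocking ClaudeBot alone and using a broad User-agent: * block can be checked with Python's standard urllib.robotparser module. This is only a sketch: the two robots.txt snippets and the example.com URL are made up for illustration, and the standard-library parser approximates how a compliant crawler reads the file rather than reproducing Anthropic's own implementation.

```python
from urllib.robotparser import RobotFileParser

ANTHROPIC_BOTS = ["ClaudeBot", "Claude-User", "Claude-SearchBot"]


def allowed_bots(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Map each Anthropic bot to whether the given robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in ANTHROPIC_BOTS}


# Hypothetical file 1: only the training bot is blocked.
training_only_block = """\
User-agent: ClaudeBot
Disallow: /
"""

# Hypothetical file 2: a broad block that applies to every bot.
broad_block = """\
User-agent: *
Disallow: /
"""

print(allowed_bots(training_only_block))
# {'ClaudeBot': False, 'Claude-User': True, 'Claude-SearchBot': True}
print(allowed_bots(broad_block))
# {'ClaudeBot': False, 'Claude-User': False, 'Claude-SearchBot': False}
```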

OpenAI's Three Bots: GPTBot, OAI-SearchBot, ChatGPT-User

OpenAI's documentation distinguishes three bots with the same structural logic:

GPTBot — collects training data for GPT models. It's the exact counterpart to ClaudeBot: it comes so your page's content can feed into future GPT model training. When it launched in 2023, GPTBot became the symbolic target of AI bot blocking, which is why many people only know this one. If you block only this one, all other OpenAI bots keep working.

OAI-SearchBot — performs ChatGPT's live search and source-citation. This is the bot that enables your page to appear in ChatGPT's live answers when a user asks something in real-time. From a buyer-visibility standpoint, this is OpenAI's most critical bot: if someone asks ChatGPT today where to turn for your service, and ChatGPT answers with live search, OAI-SearchBot is gathering the sources. Block this bot, and your page is excluded from that answer — even if GPTBot came through and learned from you for years.

ChatGPT-User — like Claude-User, a user-context bot. When someone pastes a URL into ChatGPT, ChatGPT-User reads your page on their behalf. Blocking it means ChatGPT can't process links pasted into it — ChatGPT can't generate a quick summary of your page if a user asks it to.

The most common OpenAI mistake: someone blocks GPTBot — again, reasonably if they oppose training-data collection — and believes this blocks ChatGPT from seeing their site. In reality, OAI-SearchBot and ChatGPT-User run unimpeded: your page can still appear in ChatGPT's live answers, and users can read it via link sharing. A training block and a visibility block aren't the same decision.

How to Make Conscious Decisions About All Six Bots

Before you allow or block any bot, it's worth answering two questions: Do I want my content added to AI model training? and Do I want live visibility in AI answers? These two decisions are separate, and together they lead to a clear answer for most site owners.

If you don't want your content in training material but you do want to appear in ChatGPT and Claude live answers: block GPTBot and ClaudeBot, but allow OAI-SearchBot and Claude-SearchBot. This is the configuration many industry experts recommend in 2026: the "training-free visibility" setup.

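If you want both training and live visibility, allow all six bots. If you want neither, a broad User-agent: * group with Disallow: / works, but then your page won't appear in AI answers or training material, and the same rule applies to every other crawler that has no group of its own, traditional search engines included.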

The split looks like this at the robots.txt level:

# Training bots (block if you don't want training data)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

# Search and response bots (allow if you want to be visible in AI answers)
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

# User-context bots (generally worth allowing)
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /

Important: the Allow: / line isn't required if your site is open by default. You only need it if you're overriding a previously placed broad block. If there's no general block, the absence of a Disallow rule already serves as permission.
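If you prefer to generate the file from the two questions above rather than edit it by hand, a small Python sketch could look like this. The function name build_robots_txt and its parameters are hypothetical, and the default of allowing the user-context bots is my assumption based on the "generally worth allowing" note in the sample file.

```python
TRAINING_BOTS = ["GPTBot", "ClaudeBot"]
SEARCH_BOTS = ["OAI-SearchBot", "Claude-SearchBot"]
USER_BOTS = ["ChatGPT-User", "Claude-User"]


def build_robots_txt(allow_training: bool,
                     allow_live_answers: bool,
                     allow_user_links: bool = True) -> str:
    """Build one robots.txt group per bot from three yes/no decisions."""
    decisions = (
        (TRAINING_BOTS, allow_training),
        (SEARCH_BOTS, allow_live_answers),
        (USER_BOTS, allow_user_links),  # assumption: allowed by default
    )
    groups = []
    for bots, allowed in decisions:
        rule = "Allow: /" if allowed else "Disallow: /"
        for bot in bots:
            groups.append(f"User-agent: {bot}\n{rule}")
    # Groups are separated by blank lines, as in the sample file above.
    return "\n\n".join(groups) + "\n"


# "Training-free visibility": no training, but live answers and link sharing.
print(build_robots_txt(allow_training=False, allow_live_answers=True))
```

The output only covers these six bots; any rules you already have for other crawlers would need to be merged in separately.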

What Bot Settings Don't Solve

Precise bot control is a necessary condition, but not a sufficient one. A site with no Google Business Profile, no structured data, or content that only renders with JavaScript (which most AI bots can't see) won't appear in ChatGPT and Claude answers just because you let the search bots in. Bot access grants entry; citability is determined by content and structure.

It's also worth being clear: bot access and AI recommendation aren't the same thing. The fact that Claude-SearchBot can visit your page doesn't mean Claude recommends your business. Recommendation — just like in traditional search — is primarily driven by external presence: review volume, directory listings, appearance on credible sources. Bot access creates the technical foundation; recommendation is the result of external presence built over months or years. I've written more about this in my post on GEO scores and how they differ from AI recommendation.

Bot settings also aren't a one-time check. AI companies regularly introduce new bots, change existing names, and update their robots.txt recommendations. In my post on robots.txt and AI bots, I cover in more detail how to audit your own file — and how to interpret what you find there.

How to Check Which Bots Visit You Today

The best method is your server access log: it shows which User-agents have accessed your site, when, and which URLs they downloaded. Most hosting providers make this available in the admin panel, usually under "access log" or "visitor logs." Search it for GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, ChatGPT-User, and Claude-User. A bot that shows up fetching your pages can clearly reach you; a bot that never shows up is either blocked or simply hasn't visited yet. A request for /robots.txt alone only shows that the bot checked your rules, not that it read your content.
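If you can download the log as a text file, the search can be scripted. Here is a minimal Python sketch that counts lines mentioning each of the six names. It assumes one request per line with the User-agent somewhere in that line; the sample lines are simplified, made-up entries (not real traffic or exact User-agent strings), and count_bot_hits is a hypothetical helper name.

```python
from collections import Counter

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
           "ClaudeBot", "Claude-SearchBot", "Claude-User"]


def count_bot_hits(log_lines) -> Counter:
    """Count log lines that mention each bot name (case-insensitive)."""
    hits = Counter({bot: 0 for bot in AI_BOTS})
    for line in log_lines:
        lowered = line.lower()
        for bot in AI_BOTS:
            if bot.lower() in lowered:
                hits[bot] += 1
    return hits


# Made-up, simplified sample lines for illustration only.
SAMPLE_LINES = [
    '198.51.100.7 "GET /robots.txt" "compatible; GPTBot"',
    '198.51.100.8 "GET /pricing" "compatible; Claude-SearchBot"',
    '198.51.100.9 "GET /pricing" "Mozilla/5.0 (ordinary browser)"',
]

for bot, count in count_bot_hits(SAMPLE_LINES).items():
    print(f"{bot}: {count}")
```

With a real log you would pass an open file object instead of SAMPLE_LINES, since iterating over a file yields one line at a time.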

If your server logs aren't accessible, a direct robots.txt check is also quick: open yoursite.com/robots.txt in your browser and look for User-agent lines that name these bots and are followed by Disallow: /. If the file is empty or missing, it usually means all bots can come through; that's the default state. If there's a broad User-agent: * block, it locks out every bot that has no group of its own, including search bots. Most of the time this configuration is worth refining.
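The same check can be automated once you have the file's text, for example by copying it from the browser. The sketch below uses Python's standard urllib.robotparser to list which of the six bots a given robots.txt blocks. The blocked_bots helper, the example.com URL, and the "refined" file are illustrative assumptions, and the standard-library parser may not match every crawler's own interpretation.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
           "ClaudeBot", "Claude-SearchBot", "Claude-User"]


def blocked_bots(robots_txt: str, page_url: str) -> list[str]:
    """Return the bots that the given robots.txt text keeps away from page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, page_url)]


PAGE = "https://example.com/pricing"  # hypothetical page

# An empty file blocks nothing: the default state.
print(blocked_bots("", PAGE))
# []

# A broad block, refined so that only the two search bots get through.
refined = """\
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /
"""
print(blocked_bots(refined, PAGE))
# ['GPTBot', 'ChatGPT-User', 'ClaudeBot', 'Claude-User']
```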

As part of my AI-readiness audit methodology, I check bot access separately: I verify whether the critical bots are allowed, and if not, which setting blocks them. If you want to know which bots are currently visiting your site and exactly what your settings permit or block among all six, reach out. I can run a quick audit and show you which of the six bots you're missing.

Blocking GPTBot means: I don't want my content in model training. It doesn't mean: I don't want to appear in ChatGPT answers. Between that block and your visibility stand six different bots — and you need to know precisely what each one does.

Sources