Three of the four major AI search platforms actively read and respect llms.txt files. ChatGPT (GPTBot), Claude (ClaudeBot), and Perplexity (PerplexityBot) all crawl llms.txt as part of their content discovery process. Google, the fourth, does not — and has explicitly signaled it has no plans to support it.
That gap matters. Google's dismissal of llms.txt has led some SEO professionals to write off the standard entirely. But Google is no longer the only discovery channel that counts. ChatGPT has over 400 million weekly active users. Perplexity processes more than 100 million queries per day. Claude's responses are read by developers, researchers, and professionals making decisions in every industry. For AI‑driven discovery — which is growing faster than traditional search — llms.txt already has meaningful reach independent of Google.
This guide is the authoritative reference: every major AI crawler, its exact user‑agent string, whether it reads llms.txt, and what the evidence says. If you've found conflicting information elsewhere, bookmark this one.
The Complete AI Crawler Reference Table
Eight crawlers. One table. Sorted by llms.txt support status.
| Crawler | Company | User‑Agent String | Reads llms.txt? | Evidence |
|---|---|---|---|---|
| GPTBot | OpenAI | GPTBot | Yes | OpenAI documentation; observed crawl behavior |
| ClaudeBot | Anthropic | ClaudeBot | Yes | Anthropic crawler documentation |
| PerplexityBot | Perplexity AI | PerplexityBot | Yes | Perplexity documentation; citation behavior |
| Googlebot | Google | Googlebot | No | Gary Illyes statement, July 2025 |
| Google‑Extended | Google | Google-Extended | No | No llms.txt support; separate AI training opt‑out mechanism |
| Bingbot | Microsoft | Bingbot | Evolving | No official statement; partial observed behavior in 2025‑2026 |
| Meta‑ExternalAgent | Meta | Meta-ExternalAgent | Unknown | No public documentation |
| Applebot | Apple | Applebot | Unknown | No public documentation |
Bottom line: If you're implementing llms.txt to be discovered by AI assistants, the top three crawlers are what matter. The file takes under 5 minutes to create with a free llms.txt generator.
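To put the table to work, the user‑agent tokens can be matched against incoming requests. The following is a minimal Python sketch, not an official detection method: the names AI_CRAWLER_TOKENS and identify_crawler are hypothetical, and case-insensitive substring matching is an assumption about how you want to bucket traffic. User-agent headers can be spoofed, so treat a match as a hint rather than proof.

```python
from typing import Optional

# Tokens copied from the reference table. The mapping name and the
# substring-matching approach are illustrative assumptions.
AI_CRAWLER_TOKENS = {
    "GPTBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity AI",
    "Google-Extended": "Google",
    "Googlebot": "Google",
    "Bingbot": "Microsoft",
    "Meta-ExternalAgent": "Meta",
    "Applebot": "Apple",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the first table token found in the user-agent, or None."""
    ua = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None
```

Passing the full GPTBot string from this guide to identify_crawler returns the token GPTBot, while an ordinary browser user-agent returns None.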
GPTBot - OpenAI's Web Crawler
GPTBot is OpenAI's primary web crawler, used to fetch content for training ChatGPT and powering ChatGPT's browsing and search features. It is one of the best‑documented AI crawlers, with OpenAI publishing explicit user‑agent details, IP ranges, and a dedicated documentation page.
User‑Agent and Identification
- Short user‑agent: GPTBot
- Full example: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.1; +https://openai.com/gptbot)
- IP ranges: Published at openai.com/gptbot
- To block: add User-agent: GPTBot followed by Disallow: / in robots.txt
How GPTBot Uses llms.txt
When GPTBot encounters a site, it can use an llms.txt file to understand site structure and prioritize pages. Rather than crawling every URL in a sitemap, GPTBot follows the curated page list in llms.txt to find the most relevant content. This matters specifically for ChatGPT's live web browsing feature, which uses a real‑time retrieval path — not just training data — to answer queries with cited sources.
Sites with a well‑structured llms.txt give GPTBot a cleaner signal about which pages represent core content. The effect is less about influencing training data and more about ensuring the right pages are cited when ChatGPT answers questions in your topic area.
ClaudeBot - Anthropic's Web Crawler
ClaudeBot is Anthropic's crawler, used to retrieve web content for Claude's responses and for ongoing model development. Anthropic has published documentation on ClaudeBot's behavior, user‑agent string, and how site operators can control its access.
User‑Agent and Identification
- Short user‑agent: ClaudeBot
- Full example: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +anthropic.com/claudebot)
- To block: add User-agent: ClaudeBot followed by Disallow: / in robots.txt
How ClaudeBot Uses llms.txt
ClaudeBot uses llms.txt to understand site structure before and during crawls. When Claude retrieves content for a response — particularly in its web‑search‑enabled mode — a well‑structured llms.txt helps surface the right pages rather than leaving content discovery entirely to crawl heuristics.
One practical consideration: Claude is heavily used by developers, researchers, and technical professionals, exactly the high‑intent audience most site owners want to reach. If your content serves a professional or technical audience, generating and uploading an llms.txt file is a low‑effort, high‑relevance investment.
PerplexityBot - Perplexity AI's Crawler
PerplexityBot is the crawler behind Perplexity AI, one of the fastest‑growing AI search engines. Perplexity is distinctive because it is built entirely around web retrieval — every answer includes cited sources, and PerplexityBot is core to the product rather than an optional feature.
User‑Agent and Identification
- Short user‑agent: PerplexityBot
- Full example: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
- To block: add User-agent: PerplexityBot followed by Disallow: / in robots.txt
Why Perplexity and llms.txt Are a Natural Fit
Perplexity's product is built around surfacing and citing sources. A clear llms.txt file helps PerplexityBot identify your site's most valuable pages faster, which directly increases the chance that Perplexity cites the right content when users ask questions in your area. Perplexity processes over 100 million queries per day — and every Perplexity answer visibly credits its sources, driving direct‑referral traffic from high‑intent users.
Google's Position: Why Googlebot Does Not Read llms.txt
In July 2025, Google's Search Relations representative Gary Illyes made a public statement comparing llms.txt to the discredited "keywords" meta tag, a standard Google abandoned after it was gamed into uselessness. Illyes indicated Google does not plan to support llms.txt for Googlebot or any Google AI features.
This statement gets quoted often, but the full context matters:
- Google‑Extended is a separate mechanism. Google-Extended is Google's user‑agent token for AI training opt‑outs (Gemini, Vertex AI); it has no relationship to llms.txt content discovery.
- Google AI Overviews are not affected by llms.txt. Google generates AI Overviews from its own search index. Since Googlebot ignores llms.txt, the file has no influence on AI Overview generation.
- The keywords meta comparison is contested. The keywords meta tag failed because spammers exploited it. llms.txt is read during real‑time retrieval by AI assistants that users actively query, which is a fundamentally different mechanism.
The practical takeaway: Do not implement llms.txt expecting it to improve Google rankings or AI Overviews. Implement it for the three crawlers that demonstrably read it: GPTBot, ClaudeBot, and PerplexityBot.
The Other Crawlers: Bing, Meta, and Apple
Three other major platforms run web crawlers that may interact with llms.txt — but none have published official documentation either supporting or dismissing it.
Bingbot and Microsoft Copilot
Microsoft has invested heavily in AI search through Copilot. As of May 2026, Microsoft has not published official llms.txt documentation. Some site operators report what appears to be llms.txt‑referencing behavior from Microsoft crawlers, but this remains unverified. If Bing follows OpenAI's lead on the standard, official support could arrive — watch Microsoft's crawler documentation for updates.
Meta‑ExternalAgent
Meta runs Meta-ExternalAgent for external content retrieval used in Meta AI and related products. No public documentation exists on llms.txt support. Meta's AI products are growing rapidly, but no verified reports confirm that llms.txt influences Meta AI's content retrieval.
Applebot
Apple's Applebot powers Siri, Spotlight, and Safari suggestions. No public statement on llms.txt support exists. Apple's web retrieval is narrower in scope than the major AI search platforms; this is a low‑priority watch item for most site operators.
What This Means for Your Content Strategy
If you're deciding whether to implement llms.txt based on which crawlers support it, the answer is straightforward: three major AI platforms already read it, representing hundreds of millions of active users. Google's non‑support is a real limitation for traditional SEO, but it doesn't affect AI discovery channels — which are growing.
Here is how to think about the decision:
- Implement now if: You care about appearing in ChatGPT, Claude, or Perplexity responses. Your audience includes developers, researchers, or professionals. You're building topical authority in a niche where AI citations drive traffic.
- What llms.txt does for supporting crawlers: Helps AI systems identify which pages represent your best content, improves citation accuracy, and reduces the chance high‑quality pages are overlooked during retrieval.
- What llms.txt does not do: Improve Google rankings, guarantee AI citations, replace a well‑structured sitemap, or prevent crawlers from accessing pages you don't include.
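For readers who prefer to script the file rather than use a generator, here is a minimal Python sketch of producing an llms.txt body. It assumes the layout commonly described for llms.txt (a title heading, a one-line blockquote summary, then sections of annotated links); that layout is not spelled out in this guide, and Page and build_llms_txt are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    """One curated link; `note` is an optional short description."""
    title: str
    url: str
    note: str = ""


def build_llms_txt(site_name: str, summary: str,
                   sections: Dict[str, List[Page]]) -> str:
    """Assemble llms.txt text: title, summary, then one link list per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.note:
                entry += f": {page.note}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

Writing the returned string to a file named llms.txt at the site root matches the placement this guide recommends. Listing only your strongest pages keeps the file a curated map rather than a second sitemap.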
For the full technical standard and implementation background, see our complete guide to llms.txt. For platform‑specific implementation, start with adding llms.txt to WordPress — the most common deployment scenario.
Conclusion
The AI crawler landscape in 2026 divides cleanly: GPTBot, ClaudeBot, and PerplexityBot actively read llms.txt; Googlebot does not. The three that support it represent the major AI search and assistant platforms that hundreds of millions of people use daily for research, discovery, and decision‑making.
If you want your content to be discovered and cited accurately by AI systems, generate your free llms.txt file and upload it to your site root. The file is small, has no performance overhead, and takes under 5 minutes to create.
Frequently Asked Questions
Does Google read llms.txt?
No. Google's Gary Illyes stated in July 2025 that Google does not support llms.txt. Googlebot and Google‑Extended both ignore llms.txt files. This does not affect your Google rankings either positively or negatively — the file is simply not used in Google's crawl or ranking process.
What is GPTBot's exact user‑agent string?
The short identifier is GPTBot. The full string is Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.1; +https://openai.com/gptbot). To block GPTBot in robots.txt, use User-agent: GPTBot on one line and Disallow: / on the next.
How can I tell which AI bots are actually crawling my llms.txt?
Check your server access logs for entries containing GPTBot, ClaudeBot, or PerplexityBot. Most shared hosting control panels expose raw access logs, and tools like Cloudflare Analytics provide bot traffic breakdowns. You can also ask Claude or Perplexity directly: "Visit [yourdomain.com/llms.txt] and tell me what this site is about." A coherent, accurate response shows the assistant can reach and read the file, though only your access logs confirm actual crawler visits.
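The log check can be scripted. The sketch below is one possible approach in Python, assuming your server writes the common "combined" access log format (request line, status, size, referrer, user-agent, each quoted where applicable); adjust the pattern if your host uses a different layout. The names LOG_LINE, BOTS, and count_llms_txt_hits are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the request, status, size, referrer and user-agent fields of a
# combined-format log line (an assumption about your server's log layout).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_llms_txt_hits(lines: Iterable[str]) -> Counter:
    """Count requests for /llms.txt per AI crawler token."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]
        if path != "/llms.txt":
            continue
        ua = match.group("ua").lower()
        for bot in BOTS:
            if bot.lower() in ua:
                hits[bot] += 1
                break
    return hits
```

To run it against a real file, open the log and pass the file object directly, for example count_llms_txt_hits(open("access.log")), where access.log is a placeholder path. A zero count for a crawler only means no matching request appeared in that log window.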
Does llms.txt work like robots.txt for AI crawlers?
No. They serve opposite purposes. robots.txt restricts crawler access (it says "don't go here"). llms.txt guides crawler attention (it says "these are my most important pages"). AI crawlers that support llms.txt use it as a content map, not a permissions file. You still need robots.txt if you want to block specific crawlers from your site.
How do I block an AI crawler I don't want on my site?
Add the crawler to your robots.txt file using its user‑agent string. For example, to block GPTBot: add User-agent: GPTBot on one line, then Disallow: / on the next. Repeat for each crawler you want to exclude. Well‑behaved crawlers like GPTBot, ClaudeBot, and PerplexityBot respect robots.txt directives.
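As a sketch of the rule pattern described above, the Python snippet below generates one blocking group per crawler and then checks the result with the standard library's urllib.robotparser. The helper names build_block_rules and is_blocked are hypothetical, and the check reflects how Python's parser reads the rules, which may differ in edge cases from how an individual crawler interprets them.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def build_block_rules(user_agents: Iterable[str]) -> str:
    """Return robots.txt text with one 'Disallow: /' group per crawler."""
    groups: List[str] = []
    for agent in user_agents:
        groups.append(f"User-agent: {agent}\nDisallow: /")
    return "\n\n".join(groups) + "\n"


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """True if the given robots.txt text disallows this user-agent for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

Calling build_block_rules(["GPTBot", "ClaudeBot"]) yields two groups separated by a blank line. Append them to your existing robots.txt rather than replacing the file, so rules for other crawlers stay in place.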
Do I need to reference llms.txt in my robots.txt file?
No. AI crawlers that support llms.txt look for it automatically at yourdomain.com/llms.txt — no announcement needed. The convention, like robots.txt itself, is based on a standard file location that compliant crawlers know to check.
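Because the location is fixed, you can derive it from any page URL and confirm the file is served. The following Python sketch uses only the standard library; llms_txt_url and llms_txt_is_served are hypothetical helper names, and defaulting to https when no scheme is given is an assumption.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(site_url: str) -> str:
    """Map any page URL or bare domain to the root-level llms.txt URL."""
    if "://" not in site_url:
        site_url = f"https://{site_url}"  # assumption: default to https
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def llms_txt_is_served(site_url: str, timeout: float = 10.0) -> bool:
    """Return True if the root llms.txt answers with HTTP 200 (needs network)."""
    try:
        with urllib.request.urlopen(llms_txt_url(site_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        return False
```

A True result only shows that the file is publicly reachable at the root; it says nothing about whether any crawler has fetched it.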
Does having llms.txt help with Google AI Overviews?
No. Google AI Overviews are generated from Google's own search index built by Googlebot. Since Googlebot ignores llms.txt, the file has zero influence on AI Overview generation. To appear in Google AI Overviews, focus on traditional SEO signals: topical authority, structured data markup, and high‑quality content Google already indexes.
How often do AI crawlers re‑crawl llms.txt?
No AI platform has published an official re‑crawl frequency for llms.txt. Based on observed behavior, AI crawlers tend to revisit active sites every few weeks to a few months, depending on site freshness signals. If you update your llms.txt after publishing significant new content, the updated version will be picked up on the next crawl cycle.
What's the difference between ClaudeBot and Claude's live web browsing?
ClaudeBot is Anthropic's background crawler that periodically indexes the web. Claude's live web browsing is a separate real‑time retrieval path used when Claude answers a question requiring current information. Both can interact with llms.txt — ClaudeBot during periodic indexing, live browsing during active query resolution. A well‑structured llms.txt helps both paths find your most relevant content.
Which AI crawler should I optimize for first?
If you can only focus on one, optimize for PerplexityBot — Perplexity's product is built entirely around web citation, so your content has the highest probability of appearing in user‑visible responses with source credits. In practice, a single well‑structured llms.txt file serves all three major crawlers simultaneously with no extra work required.
