Bots - Allow AI Crawlers to Index Your Site
Learn how to ensure AI search bots from ChatGPT, Claude, Perplexity and others can crawl your site. Check robots.txt and firewall settings to avoid accidentally blocking AI indexing.
Problem
Some people have a knee-jerk reaction to AI bots crawling their site, believing that the bots, or the companies behind them, are “stealing” their content and should therefore be blocked.
But there are many different types of “content” in the world.
Stephen King’s content is NYT bestsellers like “It”, and my content is literally this article, which, much to my dismay, is unlikely to be made into a movie with one of the Skarsgård boys kitted out in scary clown makeup.

There’s a fine line between distribution and theft, but services that amplify the marketing messages you’re putting out in the world and give you greater reach and visibility aren’t ripping you off.
Action
Since I do want AI search services to index my site, rewrite my copy, and present it as an answer to the questions prospects are asking, I need to make sure I’m not accidentally blocking any AI bots from crawling my site.
There are two ways you might be doing this:
Passively
Passively (aka “asking nicely”) with your site’s ‘robots.txt’ file.
You can check any website for robots.txt issues by appending /robots.txt to its domain and looking at the result: <your-domain.com>/robots.txt
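If reading robots.txt by eye gets tedious, the check can be scripted. The following is a minimal sketch using Python's standard-library robots.txt parser. The function name `blocked_bots`, the short `AI_BOT_TOKENS` list, and the `SAMPLE` file are illustrative assumptions, not part of any tool mentioned in this article.

```python
from urllib.robotparser import RobotFileParser

# A handful of the user-agent tokens named in this article (assumed subset;
# extend with whichever bots you care about).
AI_BOT_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot",
    "ClaudeBot", "PerplexityBot", "CCBot", "Google-Extended",
]

def blocked_bots(robots_txt, url="/", tokens=AI_BOT_TOKENS):
    """Return the tokens that the given robots.txt text disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [token for token in tokens if not parser.can_fetch(token, url)]

# Hypothetical robots.txt that blocks one AI bot and allows everyone else.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(SAMPLE))
```

To test a live site instead of a pasted file, `RobotFileParser` also offers `set_url()` and `read()` to fetch the file directly. Keep in mind that robots.txt is only a request: this tells you what you are asking bots to do, not what a firewall actually enforces.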
Actively
Actively with a website firewall or some type of security plugin.
Checking for active blocks is harder to do manually, which is why we developed an AI Search Console (accessible via the Knowatoa dashboard) that checks whether any of the 24 AI bots we monitor are unable to reach your site.
If you’d rather check manually, you can use a browser extension like User Agent Switcher with the following user agents:
AI2Bot
Mozilla/5.0 (compatible; AI2Bot/1.0; +http://www.allenai.org/crawler)
Amazonbot
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
Anthropic AI Bot
Mozilla/5.0 (compatible; anthropic-ai/1.0; +http://www.anthropic.com/bot.html)
Claude Web
Mozilla/5.0 (compatible; claude-web/1.0; +http://www.anthropic.com/bot.html)
ClaudeBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
Applebot-Extended
Mozilla/5.0 (compatible; Applebot-Extended/1.0; +http://www.apple.com/bot.html)
Applebot
Mozilla/5.0 (compatible; Applebot/1.0; +http://www.apple.com/bot.html)
BingBot
Mozilla/5.0 (compatible; BingBot/1.0; +http://www.bing.com/bot.html)
Bytespider
Mozilla/5.0 (compatible; Bytespider/1.0; +http://www.bytedance.com/bot.html)
GPTBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot
ChatGPT-User
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
OAI-SearchBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot
CCBot
Mozilla/5.0 (compatible; CCBot/1.0; +http://www.commoncrawl.org/bot.html)
DuckAssistBot
Mozilla/5.0 (compatible; DuckAssistBot/1.0; +http://www.duckduckgo.com/bot.html)
Google-Extended
Mozilla/5.0 (compatible; Google-Extended/1.0; +http://www.google.com/bot.html)
LinkedInBot
LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)
Meta External Fetcher
Mozilla/5.0 (compatible; meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler))
FacebookBot
Mozilla/5.0 (compatible; FacebookBot/1.0; +http://www.facebook.com/bot.html)
Omgili Bot
Mozilla/5.0 (compatible; omgili/1.0; +http://www.omgili.com/bot.html)
PerplexityBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
YouBot
Mozilla/5.0 (compatible; YouBot (+http://www.you.com))
Cohere AI
Mozilla/5.0 (compatible; cohere-ai/1.0; +http://www.cohere.ai/bot.html)
Timpi
Timpibot/0.8 (+http://www.timpi.io)
DiffBot
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729; Diffbot/0.1; +http://www.diffbot.com)
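The manual check can also be scripted rather than clicked through in a browser. Below is a rough sketch using only Python's standard library: it requests a page while sending one of the user agents above and makes a coarse guess from the HTTP status. The helper names (`build_request`, `classify`, `check_bot`) and the status-code buckets are assumptions made for illustration, not how Knowatoa's AI Search Console works.

```python
import urllib.error
import urllib.request

# Three user agents copied from the list above; add the rest as needed.
USER_AGENTS = {
    "GPTBot": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
              "compatible; GPTBot/1.1; +https://openai.com/gptbot",
    "ClaudeBot": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
                 "compatible; ClaudeBot/1.0; +claudebot@anthropic.com)",
    "PerplexityBot": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
                     "compatible; PerplexityBot/1.0; "
                     "+https://perplexity.ai/perplexitybot)",
}

def build_request(url, user_agent):
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def classify(status):
    """Coarse reading of an HTTP status (these buckets are an assumption)."""
    if 200 <= status < 400:
        return "reachable"
    if status in (401, 403, 406, 429):
        return "likely blocked"
    return "inconclusive"

def check_bot(url, user_agent, timeout=10.0):
    """Fetch `url` as the given bot and classify the outcome."""
    try:
        request = build_request(url, user_agent)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions that carry the status code.
        return classify(err.code)
    except OSError:
        # DNS failures, refused connections, and timeouts land here.
        return "inconclusive"

# Example usage (performs real network requests; replace the placeholder URL):
# for name, agent in USER_AGENTS.items():
#     print(name, check_bot("https://your-domain.com/", agent))
```

Treat the result as a rough signal only. Some firewalls decide based on the requesting IP address as well as the user agent, so a request from your own machine may be treated differently from the real crawler.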
Key Takeaways
- AI bots aren’t stealing your content; they’re amplifying your marketing reach
- Check both passive blocks (robots.txt) and active blocks (firewalls/security plugins)
- Use Knowatoa’s AI Search Console to automatically monitor bot access
- Allow all legitimate AI bots to crawl your site for maximum visibility
Test Your Site with AI Search Console
Want to know which AI bots can access your site right now? Use Knowatoa’s AI Search Console to automatically test all 24 AI bots and identify any blocks.
