OpenAI has published details about its new web crawler, named GPTBot. You can read OpenAI's documentation on GPTBot for the specifics.
What is GPTBot. GPTBot is OpenAI's web crawler. OpenAI uses it to crawl the web and consume data for its AI products, such as ChatGPT, and uses that data to provide AI-generated answers to your questions.
User agent. GPTBot's user agent token is "GPTBot" and its full user-agent string is "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)".
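As a rough illustration of how the token might be used on the server side, here is a minimal Python sketch that flags requests whose User-Agent header contains the GPTBot token. The function name `is_gptbot` is hypothetical, and since any client can send any user-agent string, a match is only a hint and not proof that the request comes from OpenAI.

```python
GPTBOT_TOKEN = "GPTBot"


def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token.

    The comparison is case-insensitive. A missing header (None or "")
    is treated as "not GPTBot".
    """
    return GPTBOT_TOKEN.lower() in (user_agent or "").lower()


# The full user-agent string OpenAI documents for GPTBot.
full_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)

print(is_gptbot(full_ua))  # True
print(is_gptbot("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # False
```

A check like this is best suited to log analysis or reporting. For actually keeping the crawler out, robots.txt remains the documented mechanism.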
Robots.txt. You can use your robots.txt to block GPTBot from accessing all or parts of your website. To disallow GPTBot from accessing your site, add the GPTBot token to your site's robots.txt:
User-agent: GPTBot
Disallow: /
To allow GPTBot to access only parts of your site, add the GPTBot token to your site's robots.txt like this:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
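Before deploying rules like these, it can help to check how a parser reads them. The sketch below uses Python's standard `urllib.robotparser` to test both rule sets against a few URLs. The helper name `allowed` and the `example.com` URLs are hypothetical, and Python's parser is only an approximation of how any particular crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rule set 1: block GPTBot from the whole site.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

# Rule set 2: allow one directory, block another.
PARTIAL = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

print(allowed(BLOCK_ALL, "GPTBot", "https://example.com/page.html"))
print(allowed(PARTIAL, "GPTBot", "https://example.com/directory-1/a.html"))
print(allowed(PARTIAL, "GPTBot", "https://example.com/directory-2/b.html"))
```

With this parser, the first and third checks come back `False` and the second `True`. Paths that match no rule, and user agents with no matching group, are treated as allowed.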
GPTBot IP ranges. OpenAI has also published the IP ranges that GPTBot uses. The list currently contains one range, but I suspect they will add more over time.
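Published IP ranges make it possible to double-check that a request claiming to be GPTBot really originates from OpenAI. The following is a minimal sketch using Python's standard `ipaddress` module. The helper name `ip_in_ranges` is hypothetical, and the CIDR block shown is a reserved documentation range used as a placeholder, not OpenAI's actual range; substitute the ranges from OpenAI's published list.

```python
import ipaddress


def ip_in_ranges(ip: str, cidr_ranges: list[str]) -> bool:
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


# Placeholder only: 192.0.2.0/24 is reserved for documentation (TEST-NET-1).
# Replace with the ranges OpenAI publishes for GPTBot.
PLACEHOLDER_RANGES = ["192.0.2.0/24"]

print(ip_in_ranges("192.0.2.10", PLACEHOLDER_RANGES))    # True
print(ip_in_ranges("198.51.100.7", PLACEHOLDER_RANGES))  # False
```

Because the published list may grow, it is safer to load the ranges from a file that is refreshed periodically than to hard-code them.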
Why we care. If you do not want GPTBot crawling your site and/or using your content for its purposes, you can disallow GPTBot from crawling your site. This is the same protocol you would use to block Googlebot, Bingbot or other web crawlers.