Bbabo NET

Recommendations for identifying and blocking the GPTBot bot from OpenAI

The Main Radio Frequency Center (GRFC, part of Roskomnadzor) has sent letters to Russian hosting providers with recommendations for identifying and blocking GPTBot, the web crawler operated by OpenAI.

The letter asks providers to assess the risk that GPTBot, while scanning their resources, could collect information about vulnerabilities or “other sensitive information, including information containing personal data”. If such risks are identified, the GRFC says the bot's requests should be blocked. The agency also sent hosting providers instructions on how to set up the necessary blocks against GPTBot.

A duty-shift employee at the GRFC's Public Communications Network Monitoring and Management Center (TsMU SSOP) confirmed to the media by phone that such a letter had been sent. Roskomnadzor declined to comment. Hosting providers Beget and Rusonyx did not respond to media inquiries. The Coordination Center for .RU/.РФ domains said it had not received the letter.

GPTBot is a web crawler that collects data from the Internet to improve the safety, capabilities and accuracy of the AI models behind ChatGPT. The OpenAI bot identifies itself with the user agent token GPTBot and the full user agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot).
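As an illustration of how a server could recognize the crawler from the user agent quoted above, here is a minimal Python sketch. The function name `is_gptbot` and the regular expression are assumptions made for this example, not part of any instruction from OpenAI or the GRFC. Note that a User-Agent header is self-reported and can be forged, so a match only shows what the client claims to be.

```python
import re

# The token "GPTBot" comes from the article; the pattern is an illustrative assumption.
GPTBOT_TOKEN = re.compile(r"\bGPTBot\b", re.IGNORECASE)


def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header claims to be GPTBot."""
    # An empty or missing header cannot match.
    return bool(user_agent) and GPTBOT_TOKEN.search(user_agent) is not None


if __name__ == "__main__":
    ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
          "compatible; GPTBot/1.0; +https://openai.com/gptbot)")
    print(is_gptbot(ua))
```

A check like this would typically sit in request-handling or log-analysis code, with the actual blocking done by the web server or firewall.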

Earlier, OpenAI published instructions for web developers on how to restrict or block GPTBot so that a site's content is excluded from the data collected to train ChatGPT. OpenAI also publishes a list of the IP addresses from which GPTBot requests originate, so that site owners can verify them.
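The two mechanisms mentioned above, a robots.txt rule and an IP address check, can be sketched in Python roughly as follows. The robots.txt rule uses the GPTBot token; the helper names `allowed_by_robots` and `ip_in_ranges` are hypothetical, and the address range is a placeholder from the reserved documentation block 192.0.2.0/24, not one of OpenAI's real ranges, which would have to be taken from the list OpenAI publishes.

```python
import ipaddress
from urllib.robotparser import RobotFileParser

# A robots.txt that asks GPTBot not to crawl any path on the site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def allowed_by_robots(robots_txt: str, agent_token: str, url: str) -> bool:
    """Check how a compliant crawler with this token should read the rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent_token, url)


# ASSUMPTION: placeholder range (reserved for documentation), not a real GPTBot range.
PLACEHOLDER_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def ip_in_ranges(ip: str, ranges) -> bool:
    """Return True if the client IP falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ranges)


if __name__ == "__main__":
    print(allowed_by_robots(ROBOTS_TXT, "GPTBot", "https://example.com/page"))
    print(ip_in_ranges("192.0.2.15", PLACEHOLDER_RANGES))
```

robots.txt is only a request that well-behaved crawlers honor, so the IP check is useful for confirming that traffic claiming to be GPTBot really comes from the published addresses.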

In early October, Google introduced the Google-Extended token for the robots.txt file. It lets Google's crawlers keep including a site in search results while not allowing the site's content to be used to train AI systems such as those that power Bard and Vertex AI, including future generations of the company's AI models.
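A rough sketch of what such a rule could look like, checked with Python's standard robots.txt parser, is shown below. The file contents and the helper `may_fetch` are assumptions made for illustration. The standard-library parser only models how a compliant reader interprets the rules; it does not reproduce how Google itself applies the Google-Extended token.

```python
from urllib.robotparser import RobotFileParser

# Opt out of AI training via Google-Extended while leaving all other agents allowed.
GOOGLE_EXTENDED_ROBOTS = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(agent_token: str, url: str) -> bool:
    """Evaluate the sample rules for the given user agent token."""
    parser = RobotFileParser()
    parser.parse(GOOGLE_EXTENDED_ROBOTS.splitlines())
    return parser.can_fetch(agent_token, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    print(may_fetch("Google-Extended", url))  # rules for the AI-training token
    print(may_fetch("Googlebot", url))        # rules for the search crawler
```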
