New Feature Addition · World management · Created by
notavi
accepted
chatgpt crawlers
What functionality is missing? What is unsatisfying with the current situation?
OpenAI supports a way to opt out of having your work crawled via robots.txt, but since that is sitewide, it obviously isn't something that can be managed on a per-world or per-account basis.
Documentation here: https://platform.openai.com/docs/gptbot
Ideally there would be an option in world settings to block crawling. Since it's probably too cumbersome to put an entry in robots.txt for each individual world (especially given the file's 500 KB limit), it may be necessary to change the world URL for either opted-in or opted-out worlds - for example:
robots.txt
User-agent: GPTBot
Disallow: /wn/*
Worlds that choose to block crawling would have their world URL changed slightly to match the pattern, swapping the /w/ in their base URL for /wn/. A similar approach could be taken if this feature is opt-out instead (block crawling for /w/, but allow it for /wc/).
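Under this scheme, mapping a world to the right URL prefix is a one-line swap. A minimal sketch (the function name, parameters, and path prefixes are illustrative, not part of any actual implementation):

```python
def world_base_url(world_id: str, opted_out: bool) -> str:
    """Return the world's base URL under the hypothetical /w/ vs /wn/ scheme.

    Worlds that opt out of crawling live under /wn/, which the robots.txt
    rule `Disallow: /wn/*` blocks for GPTBot; all other worlds stay under /w/.
    """
    prefix = "/wn/" if opted_out else "/w/"
    return prefix + world_id
```

Existing links to opted-out worlds would need redirects from the old /w/ path, which is one cost of this approach.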
An alternative option would be to simply return 403 Forbidden based on the User-Agent header whenever the GPT crawler attempts to access a world that does not wish to be crawled by AI. This would work well as either an alternate implementation or a supplemental control.
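The User-Agent check could look something like the sketch below. This is an assumption about how a server might implement it, not an existing World Anvil mechanism; the "GPTBot" token is the user-agent string OpenAI documents for its crawler, and the function name is hypothetical:

```python
# User-agent substrings of crawlers to block when a world opts out.
# "GPTBot" is the token OpenAI documents; others could be added here.
BLOCKED_AI_AGENTS = ("GPTBot",)

def crawl_response_status(user_agent: str, world_opted_out: bool) -> int:
    """Return the HTTP status the server should send for a crawler request:
    403 if the world opted out and the request comes from a blocked AI
    crawler, otherwise 200 (serve the page normally)."""
    if world_opted_out and any(bot in user_agent for bot in BLOCKED_AI_AGENTS):
        return 403
    return 200
```

Note that user-agent strings are trivially spoofed, so this only stops well-behaved crawlers - the same limitation robots.txt has.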
How does this feature request address the current situation?
It gives creators who don't want their work crawled by OpenAI or used as AI training data some ability to prevent that.
What are other uses for this feature request?
As is, there aren't any other uses for this request, but an extended version of this feature might provide multiple levels of opt-in (All Crawlers, Search Only, or No Crawlers) depending on the creator's preferences.
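The tiered preference could be modeled as a simple policy lookup. A sketch under assumed names (the enum values and crawler categories are invented for illustration, not actual settings):

```python
from enum import Enum

class CrawlPreference(Enum):
    """Hypothetical per-world crawl setting, mirroring the three tiers
    suggested above: allow everything, search engines only, or nothing."""
    ALL_CRAWLERS = "all"
    SEARCH_ONLY = "search"
    NO_CRAWLERS = "none"

def is_allowed(pref: CrawlPreference, crawler_kind: str) -> bool:
    """crawler_kind is 'search' (e.g. Googlebot) or 'ai' (e.g. GPTBot)."""
    if pref is CrawlPreference.ALL_CRAWLERS:
        return True
    if pref is CrawlPreference.SEARCH_ONLY:
        return crawler_kind == "search"
    return False  # NO_CRAWLERS blocks everything
```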
The Team's Response
Thanks for your suggestion and to everyone who participated! Unfortunately, the current solution provided by ChatGPT doesn't work for us, as we have a large number of users with different opinions regarding this matter and don't want to implement an all-or-nothing solution. That said, we are accepting the suggestion pending a better solution from them. If they implement a method we can use on a user level, we will implement it. While we don't have a specific stance about AI, we know that many of our users do, and we'll always defend your work and ownership.
Note that private content can't be crawled by bots, so depending on your use case this could be a workaround.
Current score
161/300 Votes · +39507 points