The next step in ChatGPT's rapid rise is the adoption of GPTBot. This new iteration of OpenAI's technology includes crawling web pages to deepen the output that ChatGPT can provide.
Improvements in AI seem positive, but the picture is not that clear-cut. There are legal and ethical issues with this technology.
The arrival of GPTBot has highlighted these concerns, as many major brands are blocking it instead of leveraging its potential.
However, I truly believe that there is much more to be gained than lost by fully (and responsibly) embracing GPTBot.
Why do AI bots like GPTBot crawl my website?
Understanding why bots like GPTBot behave the way they do is the first step to adopting this technology and leveraging its potential.
Simply put, bots like GPTBot crawl websites and collect information. The main difference is that instead of the AI platform being passively fed data for learning (a “training set”), the bots can actively seek out information on the web by crawling different pages.
Large language models (LLMs) scour these websites to understand the world around us. Google's C4 dataset, which covers 15.7 million sites, accounts for the majority of these LLM training sets. The bots also crawl other authoritative information sites such as Wikipedia and Reddit.
The more sites these bots can crawl, the more they can learn and the better they become. So why are companies blocking GPTBot's crawl?
Are the concerns of brands blocking GPTBot legitimate?
When I first read about companies preventing GPTBot from crawling their websites, I was perplexed and surprised.
To me, that seemed incredibly short-sighted. But I thought there must be a lot of considerations that I hadn't thought about deeply enough.
After researching and speaking with legal experts, I found the biggest reasons.
Lack of compensation for unique training data
Many brands prevent GPTBot from crawling their sites because they don't want their data to be used to train models for free. I understand wanting a piece of that billion-dollar pie, but I think this is a short-sighted view.
ChatGPT is an answer engine for the world, just like Google and YouTube. Preventing GPTBot from crawling your content may limit your brand's reach to a small number of internet users in the future.
Security concerns
Another reason behind the anti-GPTBot sentiment is security. While more legitimate than greedily hoarding data, it's still a largely unfounded concern from my perspective.
All websites should be pretty secure by now. Needless to say, the content GPTBot is trying to access is public, non-confidential content. It's the same material that Google, Bing, and other search engines crawl every day.
What cache of sensitive information do CIOs, CEOs, and other company leaders expect GPTBot to access during a crawl? And with proper security measures in place, this shouldn't be a problem, should it?
The threat of looming legal repercussions
From a legal perspective, the argument is that any crawl that occurs on a brand's site must be covered by that brand's privacy disclaimer. All websites must include a privacy disclaimer that explains how the data collected by the service is used. Lawyers argue that the language should also specify that generative AI third-party platforms may crawl the collected data.
Otherwise, personally identifiable information (PII) and customer data could become “public” and expose your brand to claims of unfair and deceptive trade practices under Section 5 of the Federal Trade Commission (FTC) Act.
This concern is understandable to some extent. If you are the legal department of a well-known brand, one of your main objectives is to keep your company out of trouble. But this legal concern applies much more to what is input into ChatGPT than to what GPTBot crawls.
Anything entered into OpenAI's platform becomes part of its data bank and can be shared with other users, leading to data leaks. However, this can only happen if the user asks questions regarding the stored information.
This is also an unfounded concern for me, as it can all be solved by using the internet responsibly. The same data principles we've used since the early days of the web still apply today. Don't enter any information you don't want to share.
The urge to save humanity from AI advances
I can't help but think that the leaders of some of the brands blocking GPTBot are biased against advances in AI technology.
We often fear what we don't understand, and some people are frightened by the idea of artificial intelligence gaining too much knowledge and becoming too powerful.
Although AI is rapidly evolving and becoming able to “think” more deeply, it is still largely controlled by humans. Additionally, the laws governing AI will grow with the technology.
When we finally reach the world of “autonomous” AI platforms, their capabilities will be guided by years of human innovation and law.
3 reasons not to block ChatGPT's GPTBot
So why should you allow GPTBot to crawl your site? Let's look on the bright side with these three key benefits of adopting OpenAI's bot technology.
1. 100 million people use ChatGPT every week
If you don't allow GPTBot to crawl your site, you'll miss the chance to maximize brand awareness among 100 million weekly users.
Sharing access to your website's content helps ensure that your brand is represented in a factual and positive way to ChatGPT users.
This means your brand is more likely to actually be recommended by ChatGPT, leading to more traffic and potential customers.
Some brands report getting 5% of their overall leads or $100,000 in monthly subscription revenue from ChatGPT. We know that our agency has already received some leads from ChatGPT.
You can also think of this as positive digital PR (DPR). In today's climate, you need to leverage DPR strategies such as brand mention campaigns.
Allowing GPTBot to crawl your site will only strengthen these efforts by allowing ChatGPT to access brand information directly from the source and proactively distribute it to its 100 million users.
2. Generative engine optimization (GEO)
Whether you have concerns about AI or not, we can all agree that AI is changing the marketing landscape. As with all new technologies and trends in our industry, companies that are slow to embrace AI as a conduit for new business and brand exposure will miss the proverbial boat.
GEO is gaining momentum as a sub-practice of SEO. If you don't direct some of your marketing efforts to this market, you'll be missing out on important opportunities. If you let them slip through the cracks, your competitors may pick them up.
We know that in today's fragmented and ever-growing marketing environment, it's easy for brands to fall behind. If your competitors spend years working on GEO, maximizing their visibility in LLMs and developing skills and expertise in this area, they will be years ahead of you.
Currently, GEO's reporting capabilities have not yet caught up to its value. This means measuring ROI can be difficult, but that doesn't mean you should ignore it and fall behind.
Brands and marketers should start embracing LLMs like ChatGPT as a new acquisition channel that should not be ignored.
3. OpenAI’s commitment to minimizing harm
A healthy distrust of AI technology is critical to its legal and ethical growth. But we also need to be open-minded and realize that if we resist things and choose not to grow and innovate, we will not be as effective as marketers.
OpenAI clearly states “minimize harm” as one of the platform's guiding principles. They also say they have a policy of respecting copyright and intellectual property, and that GPTBot will filter out sources that violate their policies.
By allowing GPTBot to crawl your site's content, you are contributing clean and accurate training data that OpenAI uses to enhance and improve the accuracy of its information.
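The article does not describe how a site allows or blocks the crawler, but this is commonly done with rules in the site's robots.txt file that name the GPTBot user agent. As a rough sketch under that assumption, the Python below uses the standard library's robots.txt parser to check whether a given set of rules would let GPTBot fetch a page. The helper name is_allowed, the two sample rule sets, and the example.com URL are all hypothetical and only for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks GPTBot but leaves other crawlers alone.
BLOCKING_RULES = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical robots.txt with no GPTBot-specific rule.
OPEN_RULES = """\
User-agent: *
Allow: /
"""

PAGE = "https://www.example.com/blog/post"  # placeholder URL

if __name__ == "__main__":
    print(is_allowed(BLOCKING_RULES, "GPTBot", PAGE))     # False
    print(is_allowed(BLOCKING_RULES, "Googlebot", PAGE))  # True
    print(is_allowed(OPEN_RULES, "GPTBot", PAGE))         # True
```

Running the same check against your own robots.txt text is a quick way to confirm that an old or inherited rule isn't shutting GPTBot out without anyone having decided it should.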
As AI technology advances, it becomes easy to get caught up in skepticism, fear, and noise. Those who resist embracing it and making the most of it will be left behind.
The opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.