Let's start by saying that the Google search algorithm hasn't been leaked, and that SEO experts don't suddenly have all the answers. But the information that did leak this week, a collection of thousands of internal Google documents, is still huge, and it offers an unprecedented glimpse into the normally closely guarded inner workings of Google.
Perhaps the most notable revelation from the 2,500 documents is that they suggest Google representatives have misled the public in the past when discussing how the Internet's largest gatekeeper evaluates and ranks content on its search engine.
How Google ranks content is a black box. Websites depend on search traffic to survive, and many go to great lengths and expense to beat out their competitors and rise to the top of search results. Higher rankings mean more traffic to their website and more revenue. As a result, website owners hang on every word Google publishes and every social media post made by employees who work on search. Their words are taken as absolute truth, and the effects reach everyone who uses Google to search.
For example, while Google spokespeople have long maintained that user clicks don't affect website rankings, the leaked documents describe several types of clicks users make and indicate that they affect search page rankings. Testimony in the U.S. Department of Justice antitrust lawsuit previously revealed a ranking factor called Navboost that uses searcher clicks to boost content higher in search results.
“To me, the larger meta lesson is that Google's public statements about what it collects and how its search engine works now have even stronger refutations,” Rand Fishkin, a search engine optimization (SEO) industry veteran, told The Verge in an email.
The leak first spread after SEO experts Fishkin and Mike King published some of the leaked documents' contents along with their analysis earlier this week. The leaked API documents contain a repository full of information and definitions about the data Google collects, some of which may relate to how web pages are ranked in search. Google initially dodged questions about the authenticity of the leaked documents, but confirmed their authenticity on Wednesday.
“We are careful not to make inaccurate inferences about searches based on information that is out of context, out of date or incomplete,” Google spokesman Davis Thompson told The Verge in an email on Wednesday. “We are sharing extensive information about how search works and the types of factors our system weighs, as well as working to protect the integrity of search results from manipulation.”
There are caveats. The documents do not show how the various attributes are weighted, and it is possible that some of the attributes they mention, such as the “small personal site” identifier or the demotion of product reviews, were introduced at some point and then phased out, or were never used in ranking sites at all.
“We don't necessarily know how [the factors named] are being used, aside from the various descriptions. There's a wide variety of terminology used, but even if the terminology is a little sparse, there's a lot of information for us,” King says. “What are the more specific aspects to consider when creating or optimizing a website?”
The claim that the world's largest search platform doesn't rank search results based on how users engage with content seems absurd at first glance, but repeated denials, carefully worded company responses, and industry publications that publish the claim without question have made it a hot topic of discussion among SEO marketers.
Another key point Fishkin and King highlighted was how Google uses Chrome data for search rankings. Google Search representatives have said that they don't use any Chrome data for rankings, but the leaked documents suggest that may not be true. For example, one section lists “chrome_trans_clicks” as informing which links from a domain appear below the main webpage in search results. Fishkin interprets this to mean that Google “uses the number of clicks on pages in Chrome browsers and uses that to determine the most popular/important URLs on a site, which go into the calculation of which to include in the sitelinks feature.”
The documents mention more than 14,000 attributes, and researchers will likely be sifting through the pages for weeks looking for clues. They mention “Twiddlers,” ranking tweaks that are rolled out separately from major system updates and that boost or demote content according to certain criteria. They also mention elements of web pages, such as who the author is, and measurements of a website's “authority.” Fishkin points out that there's also plenty the documents don't cover, including information about AI-generated search results.
So what does this mean for everyone outside the SEO industry? First, anyone who runs a website will be reading about this leak and trying to understand it. Many SEOs will try different things to see what works, and publishers, e-commerce companies, and businesses will likely plan experiments to test some of the things the documents suggest. I imagine that as this happens, websites will start to look, feel, or read a little differently, all while these industries try to make sense of this new and still vague wave of information.
“Journalists and publishers who publish about SEO and Google search need to stop uncritically repeating Google's public statements and take a tougher, more adversarial view of representatives of the search giant,” Fishkin said. “When publications repeat Google's claims as if they were fact, they are helping Google craft a narrative that is only useful to the company and not to practitioners, users, or the public.”