Google has confirmed that 2,500 pages of internal documents leaked on Monday, which detail how its internet search algorithms work, are authentic, The Verge reported Wednesday night. CNET has also confirmed this.
The massive leak of API documentation appears to confirm what search engine optimization experts have speculated for years and what Google has often denied. For example, the leaked documents seem to indicate that click-through rate affects rankings, that subdomains have their own rankings, that new websites are placed in a separate “sandbox” until they rank well in search, and that domain age is a ranking factor.
The documents were first leaked to Rand Fishkin, an SEO expert and co-founder of SparkToro and Snack Bar Studio, by Erfan Azimi, CEO of digital marketing firm EA Eagle Digital. They were also leaked to Mike King of iPullRank.
In fairness, it remains to be seen how useful this leak is today: the details it describes about Google's search algorithm may be outdated by now, and the data points it lists may have been collected but never used. Google also tweaks its search algorithms regularly. Still, this is a rare glimpse behind the scenes of Google's core business.
“We want to caution people not to make inaccurate inferences about searches based on out-of-context, outdated or incomplete information,” Google spokesman Davis Thompson said in a statement. Google has previously shared information about how search works, while insisting that it “protects the integrity of search results from manipulation.”
Google is the dominant player in online search, holding more than 90% of the market. Its dominance is the subject of a lawsuit filed by the U.S. Department of Justice, which accuses the company of maintaining a monopoly. Google is the main highway to the internet for nearly every computer, iPhone, and Android device, giving the company great power over how information is consumed. Advertisements sold against search results are also the company's main source of revenue. Last year, Google generated $175 billion in revenue from search alone. The amount of money in online search has spawned a $68 billion industry of SEO companies and experts who try to manipulate or predict the behavior of Google's search algorithms.
Google is fighting an ongoing battle against sites that stuff search results with low-quality content just to get easy ad clicks. This is why Google doesn't publish transparent details about how its search algorithm works: if it did, bad actors would simply take advantage of them. Publishers, blogs, and other small sites that produce quality content are caught in the middle of this battle. The problem of spam sites is made even worse by AI-generated content.
The changes made to Google's search algorithm last September, known as the “helpful content update,” were devastating for many small sites, including HouseFresh and RetroDodo, both of which have published detailed reports on the impact of Google's decision.