Approximately 2,500 technical documents detailing the nuts and bolts of Google's ranking algorithm appear to have been leaked. If the documents are genuine, they offer an unprecedented glimpse into the workings of the internet's most popular search engine. It would also be a huge blunder, as Google itself reportedly published the documents on GitHub before deleting them. However, nothing published on the web disappears that easily, and the documents have been archived elsewhere for posterity.
The leak provides an interesting opportunity to compare how Google actually ranks search results with the various claims the company has made about its largely mysterious black box until now. The inner workings of Google Search have long been speculated about, but they were never really known outside the company, or even to most Google employees internally.
The documents were shared with longtime SEO expert Rand Fishkin by Erfan Azimi, an SEO advisor at EA Eagle Digital, who said he shared them in the hope that they would expose “lies” being spread about Google's search platform.
This is obviously a very bold claim, and frankly, the documents are incredibly dense and technical, covering a huge range of topics and systems. At a very high level, they cover the types and characteristics of the data Google collects and uses, which sites Google prioritizes on sensitive topics like elections, how Google handles small websites, and much more.
An analysis of the documents reveals some obvious contradictions to Google's claims. For example, in 2016, Google search engineer Paul Haahr said that “it's a mistake to directly attribute rankings to clicks.”
But the documents allegedly show that Google uses a system called NavBoost, which feeds various click metrics directly into page rankings and search results.
Other points that appear to contradict Google's previous claims include its use of domain authority, the sandboxing of new websites while more data is gathered about them, and the use of user data collected from the Chrome web browser.
If all of these allegations are true, it's unclear how much of the discrepancy comes down to Google simply protecting its search IP from potential competitors, and how much reflects more cynical or sinister motives.
Moreover, as far as we can tell, the documents don't actually reveal how Google currently ranks pages. In other words, this leak doesn't seem to make it any easier to optimize web pages for better Google search rankings, as many observers probably would have hoped.
But if the documents are genuine and the claims about their implications are broadly accurate, then at the very least Google is facing a pretty significant scandal over its past statements, its credibility, and its ethics.
For now, that's a pretty big “if.” This isn't something that's going to be resolved overnight. As far as we can tell, Google has yet to comment on whether the documents are genuine, or to rebut the main criticisms that follow from them.
No doubt Google is still working out the details of its response as we write this, but we have a feeling this isn't the end of the story, and that the full impact of this alleged scandal will be felt in the coming months, if not years.