Search engines are often described as gateways to the internet, the first stop for most people when they look for information online. But Google has never said much about how it organizes the internet, making search a giant black box that dictates what we know and don't know. This week, a 2,500-page leak, first reported by search engine optimization (SEO) veteran Rand Fishkin, gave the world a look into the 26-year-old mystery of Google Search.
“I think the biggest takeaway is that there is a distinction between what Google representatives say and what Google's search engine actually does,” Fishkin said in an emailed statement to Gizmodo.
These documents provide a more detailed look at how Google Search controls the information we consume. Getting the right webpage to appear on your computer isn't a passive task: thousands of editorial decisions are made on your behalf by a secretive group of Googlers. For SEO, an industry that succeeds or fails on Google's algorithms, the leaked documents are a shock. It's like NFL referees rewriting the rules of football mid-season, and players only finding out about it in the middle of the Super Bowl.
Multiple SEO experts told Gizmodo that the leak lists 14,000 ranking factors that, at a minimum, provide a blueprint for how Google organizes all the information on the web. These factors include how Google determines a website's authority on a particular subject, the size of the website, and the number of clicks a web page receives. Google had previously denied that it uses some of these ranking factors in search, but the company confirmed that the documents, although incomplete, are genuine.
“We want to be careful not to make inaccurate inferences about searches based on out-of-context, out-of-date, or incomplete information,” a Google spokesperson said in an email to Gizmodo. “We share extensive information about how search works and the types of factors our system emphasizes, and we also work to protect the integrity of our search results from manipulation.”
As for Google's “warning,” the company hasn't revealed what's right or wrong in these documents. Google told Gizmodo that it's a mistake to assume this is comprehensive information about search, and revealing too much information could encourage malicious behavior. Ultimately, we don't know what goes into determining these factors, or how much weight Google Search gives to each one.
“We're just looking at the different variables that they're considering,” SEO expert Mike King, one of the first to analyze the leak, told Gizmodo in an interview. “This is [what Google] looks at on a website.”
The leak was first noticed by SEO expert Erfan Azimi, who found the API documentation publicly available on GitHub. It's unclear whether the documents were truly “leaked” or if Google had published them, perhaps accidentally, in a quiet corner of the web. Azimi brought the documents to Fishkin last week in an attempt to make them public. Fishkin asked King to try to make sense of the documents.
King highlighted a ranking feature called “Homepage Page Rank N” that ranks the homepage of a website based on its popularity, which could support everything the site publishes. Fishkin wrote that the leak references a system called NavBoost, first mentioned by Google vice president of search Pandu Nayak in Justice Department testimony, which measures clicks to improve rankings in Google Search. Many in the SEO industry see these documents as confirmation of what the industry has long suspected: websites that are deemed popular by Google can receive higher search rankings for queries, even if lesser-known sites have better information.
In recent months, several small publishers have seen their Google Search traffic disappear. When The Verge's Nilay Patel asked Google CEO Sundar Pichai about this last week, Pichai responded: “It's not clear that this is a uniform trend.” One ranking feature King points out seems to categorize these smaller sites as a single entity.
“There is a feature called 'smallPersonalSite', and of course I don't know how it's used, but it's [Google] trying to understand if these are smaller sites,” King said. “Right now, a lot of the smaller sites are going under. [Google is] not doing anything to offset the signals of these big brands.”
Notably, Pichai later said in an interview with The Verge that Google has also funneled traffic to smaller sites at times. These ranking features could be indicative of levers Google could use. As more national media organizations license their content to ChatGPT, Google Search also appears to be biased toward larger publishers. Overall, this could have a stifling effect, compressing what most people hear to just mainstream media organizations.
The impact of the leaked Google documents was widespread. Kristen Ruby, CEO of Ruby Media Group, who has worked in digital PR and SEO for over 15 years, told Gizmodo she received an ominous text Monday night: “Something terrible is going to happen to Google tomorrow.”
Ruby quickly spotted the leak and noted that two ranking features stood out: “isElectionAuthority” and “isCovidLocalAuthority.” These features appear to be how Google ranks the trustworthiness of web pages that provide pertinent information about elections and COVID-19, respectively. In 2019, Ruby wrote about how Google evaluates trustworthy web pages (Google E-E-A-T, an acronym for experience, expertise, authoritativeness, and trust). These evaluations are inherently political, she points out, and Google's measurement of these factors tends to be politically biased.
“It's troubling that Google doesn't provide context for important items in their data, like 'isElectionAuthority' or 'isCovidLocalAuthority'. How does Google define authority in these important areas?” Ruby said in an email. “We shouldn't have to guess what the answers are. Google should be up front and tell us the answers.”
While Google is a private company and has a right to proprietary information, Ruby argues that Google has an obligation to answer questions about these ranking features that shape the world around us. In their articles on the leak, King and Fishkin also note that “isCovidLocalAuthority” and “isElectionAuthority” are both important in helping search engines boost quality information.
“Whether we like it or not, Google is effectively a public service, so I think it's really important that they provide that kind of discernment with information,” King said. “They might balk at me saying that, but we view Google as the primary source for information on the web.”
The way Google ranks information in these examples is a microcosm of the entire search ecosystem. Millions of questions arise every day about which information to amplify and which to silence. Google and several tech companies have long tried to present themselves as opinion-free algorithms, but these ranking features show that this is not the case. Many more examples of ranking features are revealed in the 2,500-page leak.
Finding the Answer in Google's Algorithm
Google has not gone into detail about these documents, telling Gizmodo that revealing too much information could encourage malicious activity, so it's up to SEO experts to unravel this on behalf of everyone who uses Google Search. Some of the 14,000 ranking features identified last week are ones that Google has explicitly claimed it hasn't used for years.
In a 2016 video, a Google Search representative declared that Google has no website authority score. In a 2015 interview, another Googler said that using clicks directly for rankings is a mistake. Given the leaked documents and Google's response, it's hard to make sense of these comments.
“This response is a perfect example of why people hate and don't trust Google,” Fishkin said. “It's a nonsensical statement that doesn't mention the leak, offers no value, and was likely written by an AI trained on the most soulless corporate messaging of the last decade.”
In the age of AI-driven answers, Ruby noted, the way Google ranks web pages is more important than ever: instead of a series of links to different perspectives, AI might give you one clear answer. Google's new AI Overviews oddly treated a decade-old Reddit post as authoritative and told some users to apply glue to pizza. Now, only the top result may have any say, so how Google chooses its authority becomes even more important.
“We're pivoting. We're moving from one search system to another,” Ruby said. “AI is having a huge impact on search results.”
Ultimately, we don't know what Google is actually doing with these ranking features. What we do know is that Google created these classifiers, and likely many more, to rank websites across the Internet. These rankings clearly involve judgment, providing further evidence that Google Search is not an objective experience, but rather a series of editorial choices made by people inside Google.