Thousands of pages of internal documents leaked earlier this week offered unprecedented insight into how Google's search algorithm works, The Verge reports, adding that the documents suggest that Google may not have been fully transparent about its processes.
Google confirmed the authenticity of the documents in a statement to The Verge on May 30. “We want to caution people against making inaccurate inferences about search based on information that is out of context, out of date, or incomplete. We share extensive information about how search works and the types of factors our system weighs, and we are also working to protect the integrity of our search results from manipulation,” Google spokesperson Davis Thompson said.
Google's search algorithm plays a major role in determining which websites succeed and which fail. Yet the details of how it works remain largely a mystery, even as journalists, researchers and SEO experts piece together whatever information they can.
Insider revelation
SEO expert Rand Fishkin wrote in a May 29 article that he had received 2,500 pages of leaked documents from a source who hoped they could counter misleading information Google employees have previously shared about its search algorithm.
Fishkin and fellow SEO expert Mike King each analyzed the leaked documents. Fishkin published his findings in a blog post titled “An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them.”
Fishkin said the documents detail Google's search APIs and outline the information available to employees. The Verge reports that the documents are dense with technical detail and more readily understood by developers and SEO professionals than by general readers.
He said that while the documents don't directly prove that Google uses all the data and signals mentioned for search rankings, they do show what data Google collects from websites and searchers. As King pointed out in his analysis of the leak, the information provides indirect clues about what Google considers important.
Key insights from the leak
The leaked documents include details about the types of data Google collects, how it prioritizes certain sites on sensitive topics such as elections, and how it handles smaller websites. The Verge report also said that some details in the documents appear to contradict public statements made by Google representatives.
“'Lie' is a harsh word, but it's the only appropriate word to use here. I don't necessarily blame Google representatives for protecting their proprietary information, but I do take issue with the company's efforts to actively discredit people in marketing, technology and journalism who have published reproducible findings,” King said.
Google did not respond to The Verge's request for comment on the documents. Fishkin said Google is not disputing the legitimacy of the leak but has asked him to change some of the language he used to describe certain events.
Impact on the SEO industry
Google's secretive algorithms have spawned an entire industry of marketers who meticulously adhere to Google's public guidelines to optimize their websites and boost their rankings, The Verge report said. This has led to widespread criticism that Google search results are cluttered with low-quality content created to meet those guidelines, the report added. In response to such criticism, Google often defends itself by citing the guidelines.
The leaked documents call into question the accuracy of Google's public statements about its search operations. For example, King points out that while Google says it doesn't use Chrome data to rank pages, the documents suggest otherwise. The documents mention Chrome in a section discussing how websites appear in search.
Another hotly debated topic is the role of EEAT (Experience, Expertise, Authoritativeness, and Trustworthiness) in rankings. Google claims that EEAT is not a ranking factor, but the documents show that Google tracks author data, suggesting a possible connection to rankings. However, Google maintains that author bylines are for readers and do not influence rankings, The Verge report said.
Future outlook
While the leaked documents don't provide conclusive evidence of all of Google's practices, they do offer rare in-depth insight into the company's search algorithms. An ongoing U.S. antitrust lawsuit against Google that focuses on search has also led to the release of internal documents that reveal further details about Google's operations.
The lack of transparency about Google's algorithms has led to homogenization of website content as SEO marketers try to decipher Google's hints. In his post, Fishkin criticized publications that uncritically accept Google's statements and urged them to scrutinize the company's claims more closely.
“Historically, some of the search industry's loudest and most prolific publishers have been content to uncritically repeat Google's public statements. They write headlines that say 'Google says XYZ is true,' rather than 'Google asserts XYZ, but the evidence suggests otherwise.' Please, do better. If this leak and DOJ trial can produce just one change, I hope it is this,” Fishkin said.