Leaking Pipe: Search engine optimization (SEO) techniques aim to improve the quality and, more commonly, the quantity of keywords in order to “game” Google's algorithms and climb the search engine results pages (SERPs). No one outside Google knows exactly how these algorithms work, but a recent leak may reveal some of the internet's most closely guarded secrets.
An automated Google bot recently committed to GitHub confidential documents explaining how to use the company's Content Warehouse API. The commit appears to have been a mistake, and Google has since tried, and failed, to undo the leak. With the secrets out, SEO experts are studying the leaked documents to uncover the capabilities of the Content Warehouse API.
Erfan Azimi, CEO of SEO firm EA Digital Eagle, first discovered the Google documentation, which he then shared with other SEO professionals. The Content Warehouse API appears to be a tool intended for internal use by Google employees.
The mistaken commit revealed previously unknown details about how Google's search engine works and the thousands of attributes used in the Content Warehouse API. According to the documents, Google Search uses more than 14,000 different attributes to classify web content, but the documents do not reveal how much “weight” each attribute carries in search indexing.
The leaked documents also contradict some of Google's previous statements about search, such as its claim that click-centric user signals are not taken into account when indexing content. Google has said that subdomains are considered separately in rankings, but the Content Warehouse documents do not back up that claim. Other inconsistencies with Google's public statements involve the use of sandboxes for new websites and the assignment of “authority scores” that help sites rank higher in the SERPs.
The documents also suggest that Google uses questionable metrics to rank sites. For example, one of the Content Warehouse's modules uses Chrome views as an indicator of a website's quality, so all other factors being equal, sites that get more visits from Chrome users will be ranked higher.
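To make the "all other factors being equal" point concrete, here is a minimal Python sketch of a weighted scoring scheme. Everything in it is an assumption for illustration: the attribute names, the weights, and the `score` and `rank` helpers are hypothetical, and the leaked documents do not disclose how Google actually weights or combines its attributes.

```python
from dataclasses import dataclass

# Hypothetical weights. The leaked documents do NOT disclose real values,
# and these attribute names are invented for illustration only.
ASSUMED_WEIGHTS = {"content_quality": 0.7, "chrome_views": 0.3}


@dataclass
class Site:
    name: str
    signals: dict[str, float]  # each signal assumed normalized to [0, 1]


def score(site, weights=ASSUMED_WEIGHTS):
    """Weighted sum of a site's signals; missing signals count as 0."""
    return sum(w * site.signals.get(attr, 0.0) for attr, w in weights.items())


def rank(sites, weights=ASSUMED_WEIGHTS):
    """Return sites ordered from highest to lowest score."""
    return sorted(sites, key=lambda s: score(s, weights), reverse=True)


# Two sites identical in every signal except Chrome views.
site_a = Site("a.example", {"content_quality": 0.8, "chrome_views": 0.2})
site_b = Site("b.example", {"content_quality": 0.8, "chrome_views": 0.9})

ranking = [s.name for s in rank([site_a, site_b])]
print(ranking)
```

In this toy model, the site with more Chrome views ranks first because it is the only signal that differs. How much it would matter in practice depends on weights that remain unknown.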
SEO is one of the most controversial industries related to internet search, and many of its professionals and analysts will likely study Google's Content Warehouse documentation over the coming weeks. So far, Mountain View has not made any official statement about the potentially devastating leak. However, it is safe to assume that its engineers are working overtime to mitigate the impact.