In a recent LinkedIn post, Google analyst Gary Illyes raised awareness about two issues plaguing web crawlers: soft 404s and other “cryptographic” errors.
These seemingly harmless mistakes can have a detrimental effect on your SEO efforts.
Understanding Soft 404s
Soft 404 errors occur when a web server returns the standard “200 OK” HTTP status code for a page that doesn't exist or contains an error message, misleading web crawlers and causing them to waste resources on non-existent or useless content.
Illyes likened the experience to visiting a coffee shop where not every item on the menu is available, a scenario that may be frustrating for a human customer but poses a bigger problem for web crawlers.
Illyes explains:
“Crawlers use the status code to interpret whether a fetch was successful, even if the page basically only contains an error message. Crawlers can waste resources by returning to the same page multiple times. If you have a lot of such pages, the resource costs grow exponentially.”
The Hidden Cost of Soft Errors
The impact of soft 404 errors goes beyond inefficient use of crawler resources.
Illyes said these pages are filtered out during indexing, so they're unlikely to show up in search results.
To address this issue, Illyes recommends serving an appropriate HTTP status code when an error occurs on the server or client.
This allows the crawler to understand the situation and allocate resources more effectively.
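As a rough illustration of that recommendation, the sketch below picks a status code that matches what the page body says. The `PAGES` and `REMOVED` lookups and the `respond` function are hypothetical names invented for this example; a real site would query its CMS or database instead.

```python
from http import HTTPStatus

# Hypothetical site content; these names are assumptions for illustration only.
PAGES = {"/": "<h1>Home</h1>", "/menu": "<h1>Menu</h1>"}
REMOVED = {"/old-promo"}  # pages that were deleted on purpose


def respond(path: str) -> tuple[int, str]:
    """Return (status code, body) so the status agrees with the body."""
    if path in PAGES:
        return HTTPStatus.OK, PAGES[path]
    if path in REMOVED:
        # 410 Gone tells a crawler the page was removed permanently.
        return HTTPStatus.GONE, "<h1>This page has been removed</h1>"
    # A soft 404 would send 200 here together with the error message.
    return HTTPStatus.NOT_FOUND, "<h1>Page not found</h1>"


if __name__ == "__main__":
    for path in ("/menu", "/old-promo", "/missing"):
        status, _ = respond(path)
        print(path, int(status))
```

The error page can still show a friendly message to visitors. What matters to a crawler is the status code sent alongside it.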
Illyes also warned against rate-limiting crawlers with messages like “You're making too many requests, we'll slow you down” because crawlers cannot interpret such text-based instructions.
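One machine-readable alternative is to signal rate limiting through the response status and headers rather than the page text. The sketch below uses the standard `429 Too Many Requests` status with a `Retry-After` header. Illyes's post, as quoted here, does not name specific codes, so that choice is an assumption, and `FixedWindowLimiter` is a hypothetical, deliberately simple limiter rather than a production design.

```python
import math
import time
from http import HTTPStatus


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._counts = {}  # client id -> (window start, request count)

    def check(self, client: str):
        """Return (status, headers, body) for one incoming request."""
        now = self.clock()
        start, count = self._counts.get(client, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the old window expired; start a new one
        count += 1
        self._counts[client] = (start, count)
        if count > self.limit:
            # Tell the client, in a header it can parse, when to come back.
            retry_after = max(1, math.ceil(start + self.window - now))
            headers = {"Retry-After": str(retry_after)}
            return HTTPStatus.TOO_MANY_REQUESTS, headers, "Too many requests"
        return HTTPStatus.OK, {}, "ok"
```

The body text is still there for humans, but the status code and header carry the same information in a form that automated clients can act on.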
Why SEJ Cares
Soft 404 errors can affect the crawlability and indexing of your website.
Addressing these issues allows crawlers to focus on fetching and indexing pages with valuable content, potentially improving your site's visibility in search results.
Eliminating soft 404 errors allows for more efficient use of server resources, as crawlers won't waste bandwidth by repeatedly visiting error pages.
How This Can Help You
To identify and resolve soft 404 errors on your website, consider the following steps:
- Regularly monitor your website's crawl reports and logs to identify pages that contain error messages but still return an HTTP 200 status code.
- Implement proper error handling on your server, making sure that error pages are served with the appropriate HTTP status code (404 for not found, 410 for permanently deleted, etc.).
- Use tools like Google Search Console to monitor your site's index coverage and identify pages that are flagged as soft 404 errors.
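The first step can be partly automated. The sketch below is a simple heuristic, not a description of how Google detects soft 404s: it flags crawled pages that return 200 but whose text reads like an error page. The phrase list, the `(url, status, body)` record format, and both function names are assumptions made for this example.

```python
import re

# Hypothetical phrases that suggest an error page; tune these for your own site.
ERROR_PHRASES = re.compile(
    r"not found|no longer available|does not exist|doesn't exist",
    re.IGNORECASE,
)


def looks_like_soft_404(status: int, body: str) -> bool:
    """True when a page returns 200 but its text reads like an error page."""
    if status != 200:
        return False  # a real error status is already correct
    text = re.sub(r"<[^>]+>", " ", body)  # crude tag stripping for the sketch
    return bool(ERROR_PHRASES.search(text))


def find_soft_404s(crawl):
    """Filter an iterable of (url, status, body) records from your own crawl."""
    return [url for url, status, body in crawl if looks_like_soft_404(status, body)]
```

A check like this will produce false positives, for example on an article that discusses "not found" errors, so treat its output as a list of pages to review by hand.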
By proactively addressing soft 404 errors, you can improve your website’s crawlability, indexing, and SEO.
Featured image: Julia Tim/Shutterstock