Finding Broken Links
Link rot is a fact of life on the web, but all hope is not lost. InterroBot can automate the discovery of broken links, so you don't have to hear about it from a confused user, or worse, never hear of it at all.
The damage goes beyond user experience. Google has stated that internal link structure is a critical ranking factor, so broken links in your web content can lead to lost search traffic.
Broken links and missing media are both easy to locate with the help of a web crawler such as InterroBot. There are two ways to identify broken links: the Link Checker report is the recommended way to generate a spreadsheet of issues fast, while search is better suited to self-directed link cleanup. Once you've identified the trouble spots, there are various strategies for fixing links.
Using the Link Checker Report
The Link Checker is one of InterroBot's Core Reports. It is available out of the box. The report is run after crawling, and a spreadsheet of broken links is generated. From there, you would generally work off the spreadsheet in your CMS admin.
For sites with thousands of pages, fix broken links on high-authority pages first, meaning the pages that receive the most traffic or have the strongest internal link profiles. This approach maximizes the SEO value of your repair efforts. Learn more about analyzing page importance in Webpages and Assets.
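The prioritization described above can be sketched in a few lines of Python. This is only an illustration: it assumes the report has been saved as CSV, and the column names (`source_page`, `broken_url`, `status`, `source_inlinks`) are hypothetical stand-ins, not InterroBot's actual export schema.

```python
import csv
import io

# Hypothetical export. The column names are assumptions made for this sketch,
# not the actual layout of the Link Checker spreadsheet.
SAMPLE_CSV = """source_page,broken_url,status,source_inlinks
/blog/old-post,/files/guide.pdf,404,3
/,/pricing-2019,404,120
/docs/setup,/img/diagram.png,500,45
"""

def prioritize(csv_text):
    """Return rows sorted so the most-linked source pages come first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Inbound link count is used here as a rough proxy for page authority.
    return sorted(rows, key=lambda row: int(row["source_inlinks"]), reverse=True)

if __name__ == "__main__":
    for row in prioritize(SAMPLE_CSV):
        print(row["source_inlinks"], row["source_page"], "->", row["broken_url"], row["status"])
```

Sorting by inbound link count is one possible stand-in for authority; traffic figures from an analytics export would work the same way if you have them.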
Locating Broken Resources with Search
The project's search page offers two canned searches, available as the Client Errors and Server Errors buttons under the search form. They are the most direct way to filter results down to missing and broken web content.
Client Errors. Filters to client HTTP errors (400-451). While 404 Not Found errors tend to dominate the results, 403 Forbidden and 400 Bad Request (often poorly pasted/keyed links) are regulars.
Server Errors. Filters to server HTTP errors (500-511+). These include the dreaded 500 Internal Server Error, along with canonical and custom error codes dealing with timeouts, SSL certificate failures, and more.
For more powerful filtering options, including combinations of status codes with specific URL patterns or content types, see Advanced Search.
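To make the split between the two canned searches concrete, here is a minimal Python sketch that labels a status code as a client or server error. It uses the standard 4xx and 5xx classes rather than InterroBot's exact ranges, and the function names are illustrative only.

```python
from http import HTTPStatus

def classify_status(status):
    """Label an HTTP status code by its standard error class."""
    if 400 <= status <= 499:
        return "client error"
    if 500 <= status <= 599:
        return "server error"
    return "not an error"

def describe(status):
    """Return a readable summary such as '404 Not Found (client error)'."""
    try:
        phrase = HTTPStatus(status).phrase
    except ValueError:
        # Custom or non-standard codes have no registered reason phrase.
        phrase = "Unknown"
    return f"{status} {phrase} ({classify_status(status)})"

if __name__ == "__main__":
    for code in (400, 403, 404, 500, 520):
        print(describe(code))
```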
Locating References to the Broken Resource
Client and server errors occur not only on linked webpages but on linked assets as well. Images, JavaScript, and CSS files are all capable of producing HTTP errors.
The easiest way to identify pages that link to the problematic resource is to click the search result for the broken resource and look at its Inlinks panel.
The pages listed under Inlinks contain the href/src attributes that need fixing, assuming the linked resource isn't coming back.
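Once you have a page from the Inlinks panel, finding the offending attributes in its markup is mechanical. The following is a rough sketch using only Python's standard library; the class and function names are made up for this example, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ReferenceFinder(HTMLParser):
    """Collect tags whose href or src resolves to a given broken URL."""

    def __init__(self, page_url, broken_url):
        super().__init__()
        self.page_url = page_url
        self.broken_url = broken_url
        self.matches = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the page's own URL.
                if urljoin(self.page_url, value) == self.broken_url:
                    self.matches.append((tag, name, value))

def find_references(page_url, html, broken_url):
    """Return (tag, attribute, raw value) tuples that point at broken_url."""
    finder = ReferenceFinder(page_url, broken_url)
    finder.feed(html)
    return finder.matches

if __name__ == "__main__":
    html = '<a href="/old-page">Old</a><img src="../img/missing.png">'
    print(find_references("https://example.com/blog/post", html,
                          "https://example.com/old-page"))
```

Resolving each attribute against the page URL matters because the same broken resource may be referenced both relatively and absolutely.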
How to Fix the Link
In all but a handful of cases, these HTTP error responses are not what your users expected. On their end, the experience is that of a "bad link." The fix? Well, that's up to you. Here are some common approaches.
- Fix the web server/application
  - If a server error, and it's on your site
  - If the problem is widespread due to a webpage template/theme
- Bring the content back online
  - If it was misplaced/unpublished/what-have-you
- Add an HTTP redirect, forwarding to similar content
  - If you are an SEO maximalist
  - 301 redirects preserve link equity and pass ranking signals to the new URL
- Unlink the webpages generating errors on the inbound links side
  - If the content shouldn't be linked in the first place
  - If the page is intentionally retired (also consider 410 Gone)
  - If the source of the error is external/outside of your control
- Find a suitable replacement URL on the inbound links side
  - If link preservation is important
  - If page content was shuffled in a reorganization
  - Check archive.org's Wayback Machine for an old snapshot
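For the Wayback Machine option, the lookup can be scripted. The sketch below assumes the Internet Archive's public availability endpoint and its JSON response shape (an `archived_snapshots.closest` object). The helper names are hypothetical, and `lookup` makes a live network request, so treat it as a starting point rather than a finished tool.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Internet Archive's public availability API.
WAYBACK_API = "https://archive.org/wayback/available"

def build_query(url):
    """Build the availability query URL for a possibly dead link."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": url})

def closest_snapshot(payload):
    """Extract the closest snapshot URL from a decoded response, if any."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None

def lookup(url, timeout=10):
    """Query the API over the network and return a snapshot URL or None."""
    with urllib.request.urlopen(build_query(url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))

if __name__ == "__main__":
    print(lookup("example.com"))
```

A snapshot URL found this way can serve as a temporary replacement link, or as a source for restoring the original content.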
Regular link audits should be part of your site maintenance routine. If you're managing a large site or need to automate broken link detection, explore API and Plugin Development for integration options. New to crawling? Start with Getting Started to set up your first project.
InterroBot is a web crawler and developer tool for Windows, macOS, Linux, iOS, and Android.
Want to learn more? Check out our help section or download the latest build.