Finding broken links in a website

Some people manually check every page on a website to search for broken links. This is feasible for websites with very few pages, but it gets difficult as the number of pages becomes large. Automating the process of finding broken links makes it much easier. We can find the broken links by using HTTP manipulation tools. Let's see how to do it.

Getting ready

To identify the links and find the broken ones among them, we can use lynx and cURL. lynx has an option, -traversal, which recursively visits pages on the website and builds a list of all hyperlinks in the website. We can then use cURL to verify whether each of those links is broken.
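
As a rough illustration of how the two tools can fit together, here is a minimal sketch. It is not the recipe's own script. The function names (is_broken_status, http_status, find_broken_links) are hypothetical. The sketch assumes that lynx leaves its traversal results in files named traverse.dat and reject.dat in the current directory, and that any final status outside the 1xx to 3xx range counts as broken. Both assumptions are worth checking against your lynx build and your own definition of "broken".

```shell
#!/bin/bash
# Sketch: collect links with lynx -traversal, then test each one with curl.

# Succeeds (returns 0) when a status code should be treated as broken.
# Assumption: 1xx, 2xx and 3xx are fine; 000 (no response), 4xx, 5xx
# and anything unexpected are broken.
is_broken_status() {
  case $1 in
    [123][0-9][0-9]) return 1 ;;
    *) return 0 ;;
  esac
}

# Prints the final HTTP status code for a URL, using a HEAD request
# and following redirects. curl prints 000 when it gets no response.
http_status() {
  curl -s -I -L -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Crawls from a start URL and prints every link that looks broken.
find_broken_links() {
  local start_url=$1 workdir link count=0
  workdir=$(mktemp -d) || return 1

  # Run lynx inside the scratch directory so its output files land there.
  ( cd "$workdir" && lynx -traversal "$start_url" > /dev/null )

  # Assumption: these two files hold the URLs lynx saw, one per line.
  cat "$workdir"/traverse.dat "$workdir"/reject.dat 2> /dev/null |
    sort -u > "$workdir/links.txt"

  while read -r link; do
    [ -z "$link" ] && continue
    if is_broken_status "$(http_status "$link")"; then
      echo "$link"
      count=$((count + 1))
    fi
  done < "$workdir/links.txt"

  rm -rf "$workdir"
  [ "$count" -eq 0 ] && echo "No broken links found."
  return 0
}
```

With these functions loaded, a call such as find_broken_links http://example.com/ prints one suspect URL per line. Some servers reject HEAD requests even though the page itself is fine, so a link reported here may deserve a second check with a normal GET request.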

How to do it...

Let's write a Bash script with the ...
