11.15. Program: Finding Fresh Links

Example 11-6, fresh-links.php, is a modification of the program in Recipe 11.14 that produces a list of links and their last modified times. If the server on which a URL lives doesn’t provide a last modified time, the program reports the time the URL was requested instead. If the program can’t retrieve the URL successfully, it prints the HTTP status code it received when it tried. Run the program by passing it a URL to scan for links:

% fresh-links.php http://www.oreilly.com
http://www.oreilly.com/index.html: Fri Aug 16 16:48:34 2002
http://www.oreillynet.com: Mon Aug 19 10:18:54 2002
http://conferences.oreilly.com: Fri Aug 16 19:41:46 2002
http://international.oreilly.com: Fri Mar 29 18:06:32 2002
http://safari.oreilly.com: 302
http://www.oreilly.com/catalog/search.html: Tue Apr  2 19:05:57 2002
http://www.oreilly.com/oreilly/press/: 302
...

This output is from a run of the program at about 10:20 A.M. EDT on August 19, 2002. The link to http://www.oreillynet.com is very fresh, but the others are of varying ages. The link to http://www.oreilly.com/oreilly/press/ doesn’t have a last modified time next to it; instead, it has an HTTP status code (302). This means it’s been moved elsewhere, as reported by the output of stale-links.php in Recipe 11.14.
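The reporting rules described above can be sketched as a small function. This is only an illustration, not the code of Example 11-6: the name describe_link( ) and its parameters are hypothetical, the sketch assumes that any status other than 200 counts as an unsuccessful retrieval, and it formats times in UTC so that the result doesn’t depend on the local time zone (the sample output above is in local time).

```php
<?php
// Hypothetical helper (not from the book): build the report line for one link
// from an already-fetched status code and set of response headers.
function describe_link(string $url, int $status, array $headers, int $requestTime): string
{
    // Assumption: anything other than 200 OK is reported as its bare status code.
    if ($status !== 200) {
        return "$url: $status";
    }

    // HTTP header names are case-insensitive, so normalize the keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);

    // Fall back to the request time if Last-Modified is missing or unparseable.
    $modified = isset($headers['last-modified'])
        ? strtotime($headers['last-modified'])
        : false;
    if ($modified === false) {
        $modified = $requestTime;
    }

    // Layout similar to the sample output, but in UTC and without day padding.
    return "$url: " . gmdate('D M j H:i:s Y', $modified);
}

// A redirected URL is reported with its status code only.
echo describe_link('http://safari.oreilly.com', 302, [], time()), "\n";
```

Keeping the decision separate from the HTTP request makes the three cases (a Last-Modified header, no header, an unsuccessful retrieval) easy to check without touching the network.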

The program to find fresh links is conceptually almost identical to the program to find stale links. It uses the same pc_link_extractor( ) function from Recipe ...
