The W3C Validator to RSS

Of all the tasks of Hercules, the one where he had to keep his web site’s XHTML validated was the hardest. Without wanting to approach the whole Valid XHTML Controversy, we can still safely say that keeping a site validated is a pain. You have to validate your code, most commonly using the W3C validator service at http://validator.w3.org, and you have to keep going back there to make sure nothing has broken.

You have to do that unless, of course, you’re subscribed to a feed of validation results. This script does just that, providing an RSS interface to the W3C validator.

You pass the URL you want to test as a query in the feed URL, like so: http://www.example.org/validator.cgi?url=http://www.example.org/index.html.
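If the page you want to test has a query string of its own, it is safer to escape it before appending it to the feed URL. The following is a minimal sketch, not part of the original script: it assumes the URI::Escape module is installed, and both the helper name feed_url_for and the script location are hypothetical.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical location of the script; substitute your own.
my $feed_base = 'http://www.example.org/validator.cgi';

# Hypothetical helper: build a feed URL for the page to be validated,
# escaping the page URL so it survives as a single query value.
sub feed_url_for {
    my ($page_url) = @_;
    return $feed_base . '?url=' . uri_escape($page_url);
}

print feed_url_for('http://www.example.org/index.html'), "\n";
```

CGI's param method decodes the value again on the way in, so the script still receives the original URL.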

Walking Through the Code

We’re using the traditional Perl start plus XML::RSS, CGI, LWP::Simple, and XML::Simple, the last of which will parse the results coming back from the validator. Note the classic gotcha: LWP::Simple and CGI both export a function named head, which produces a prototype mismatch warning, so we import only get from LWP::Simple.

use warnings;
use strict;
use XML::RSS;
use CGI qw(:standard);
use LWP::Simple 'get';
use XML::Simple;

Now, grab the URL from the query string, and use LWP::Simple to retrieve the results. The W3C provides an XML output mode for the validator, and this is what we’re using here. It is, however, classed as beta and flaky, and might not always work.

# Read the page to validate from the query string
my $cgi = CGI->new;
my $url = $cgi->param('url');

# Ask the validator for its XML output
my $validator_results_in_xml = get("http://validator.w3.org/check?uri=$url;output=xml");
...
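Because the XML output mode is described as beta, it may be worth guarding against a failed fetch before parsing anything. The sketch below is not the book's code. It assumes URI::Escape is available for escaping the target URL, and it takes the fetching routine as a parameter so it can be tried without network access. The helper name fetch_validation_xml is hypothetical.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical helper, not part of the original script.
# $fetch is any code reference that takes a URL and returns the body,
# or undef on failure (LWP::Simple's get() behaves this way).
sub fetch_validation_xml {
    my ($url, $fetch) = @_;

    # Nothing to validate if no URL was supplied.
    return undef unless defined $url && length $url;

    # Escape the target so its own query string survives intact.
    my $request = 'http://validator.w3.org/check?uri='
                . uri_escape($url) . ';output=xml';

    # Pass the fetcher's result straight back; undef signals failure.
    return $fetch->($request);
}

# In the CGI script this might be called as:
#   use LWP::Simple 'get';
#   my $xml = fetch_validation_xml($url, \&get);
#   die "Validator did not respond\n" unless defined $xml;
```

Checking for undef before handing the result to XML::Simple means the script can report the failure itself instead of dying inside the parser.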
