The W3C Validator to RSS
Of all the tasks of Hercules, the one where he had to keep his web site’s XHTML validated was the hardest. Without wanting to approach the whole Valid XHTML Controversy, we can still safely say that keeping a site validated is a pain. You have to validate your code, most commonly using the W3C validator service at http://validator.w3.org, and you have to keep going back there to make sure nothing has broken.
You have to do that unless, of course, you’re subscribed to a feed of validation results. This script does just that, providing an RSS interface to the W3C validator.
You pass the URL you want to test as a query in the feed URL, like so:
http://www.example.org/validator.cgi?url=http://www.example.org/index.html
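The example address works as written because the page being tested has no ? or & of its own. For a page that does, the target URL probably needs percent-encoding before it goes into the feed URL, or the server will split it at the wrong place. What follows is only a minimal sketch under that assumption: escape_query_value and feed_url_for are hypothetical helper names, not part of the script, and the search.cgi address is made up for illustration. If the URI distribution is installed, URI::Escape's uri_escape does much the same job.

```perl
use strict;
use warnings;

# Percent-encode every character outside the RFC 3986 unreserved set.
# Assumes $text is a byte string (for example, already UTF-8 encoded).
sub escape_query_value {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Hypothetical helper: $script_url is wherever validator.cgi is installed.
sub feed_url_for {
    my ($script_url, $page_url) = @_;
    return $script_url . '?url=' . escape_query_value($page_url);
}

# Build a subscription URL for a page that carries its own query string.
print feed_url_for(
    'http://www.example.org/validator.cgi',
    'http://www.example.org/search.cgi?q=rss&page=2',
), "\n";
```

CGI decodes the parameter again on the way in, so the script still sees the original address.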
Walking Through the Code
We’re using the traditional Perl start plus LWP::Simple and XML::Simple, which will parse the results coming back from the validator. Note that, in the classic gotcha, LWP::Simple and CGI clash: both export a function named head by default. So we give LWP::Simple an explicit import list, asking only for get, to prevent a prototype mismatch warning.
use warnings;
use strict;
use XML::RSS;
use CGI qw(:standard);
use LWP::Simple 'get';
use XML::Simple;
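The 'get' after LWP::Simple in the lines above is an import list, and it is worth seeing the mechanism in isolation. The sketch below does not load CGI or LWP::Simple at all. It uses two hypothetical stand-in packages, Local::Fetcher and Local::Markup, that both export head by default, and shows that naming only get in the import list leaves a single, unambiguous head in the main program.

```perl
use strict;
use warnings;

# Two stand-in modules that both export head() by default, mirroring the
# LWP::Simple and CGI situation. Both package names are hypothetical.
BEGIN {
    package Local::Fetcher;
    use Exporter 'import';
    our @EXPORT = qw(get head);
    sub get  { return "body of $_[0]" }
    sub head { return "headers of $_[0]" }

    package Local::Markup;
    use Exporter 'import';
    our @EXPORT = qw(head);
    sub head { return "<head>$_[0]</head>" }
}

# An explicit import list replaces the default export list, so only get()
# arrives from the fetcher; head() then comes from the markup module alone.
BEGIN {
    Local::Fetcher->import('get');
    Local::Markup->import;
}

print get('http://www.example.org/'), "\n";
print head('My page'), "\n";
```

An alternative, if you would rather import nothing, is an empty list (use LWP::Simple ();) and a fully qualified call to LWP::Simple::get.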
Now, grab the URL from the query string, and use LWP::Simple to retrieve the results. The W3C provides an XML output mode for the validator, and this is what we’re using here. It is, however, classed as beta and flaky, and might not always work.
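# Read the page to check from the feed URL's query string.
my $cgi = CGI->new();
my $url = $cgi->param('url');
# Ask the validator for its XML report on that page.
my $validator_results_in_xml = get("http://validator.w3.org/check?uri=$url;output=xml");
...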
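Since the XML mode might not always work, and LWP::Simple's get returns undef when a request fails, it seems prudent to check the result before handing it to the parser. One possible approach, sketched here rather than taken from the script, is to retry a few times. fetch_with_retries is a hypothetical helper, the default of three attempts is an arbitrary choice, and the fetcher is passed in as a code reference so that the example runs without touching the network.

```perl
use strict;
use warnings;

# $fetch is any code ref that takes a URL and returns the body, or undef
# on failure. LWP::Simple's get() has that shape.
sub fetch_with_retries {
    my ($request_url, $fetch, $attempts) = @_;
    $attempts = 3 unless defined $attempts;
    for my $try (1 .. $attempts) {
        my $body = $fetch->($request_url);
        return $body if defined $body && length $body;
    }
    return;    # nothing usable came back; the caller decides what to report
}

# Stand-in fetcher for demonstration: fails twice, then answers.
my $calls = 0;
my $flaky_fetch = sub { return ++$calls < 3 ? undef : '<result/>' };

my $xml = fetch_with_retries('http://validator.w3.org/check', $flaky_fetch);
print defined $xml ? "got a report after $calls tries\n" : "validator unavailable\n";
```

In the script itself you would pass \&get as the second argument, and publish a feed item saying the validator was unavailable if nothing comes back.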