Creating RSS 0.9x Feeds

RSS 0.91 and 0.92 feeds are created in the same way — the additional elements found in 0.92 are well handled by the existing RSS tools.

Of course, you can always hand-code your RSS feed. Doing so certainly gets you on top of the standard, but it’s neither convenient, quick, nor recommended. Ordinarily, feeds are created by a small program in one of the scripting languages: Perl, PHP, Python, etc. Many CMSs already create RSS feeds automatically, but you may want to create a feed in another context. Hey, you might even write your own CMS!

There are various ways to create a feed, all of which are used in real life:

XML transformation

Running a transformation on an XML master document to convert the relevant parts into RSS. This technique is used in Apache AxKit-based systems, for example.

Templates

Substituting values within an RSS feed template. This technique is used within the Movable Type weblogging platform, for example.

An RSS-specific module or class

Building the feed through a module or class written specifically for RSS. This technique is used within hundreds of little ad hoc scripts across the Net, for example.

We’ll look at all three of these methods, but let’s start with the third, using an RSS-specific module. In this case, it’s Perl’s XML::RSS.

Creating RSS with Perl Using XML::RSS

Jonathan Eisenzopf’s XML::RSS module for Perl is one of the key tools in the Perl RSS world. It is built on top of XML::Parser — the basis for many Perl XML modules — and it is object-oriented. Actually, XML::RSS also supports both creating RSS 1.0 and parsing existing feeds, but in this section we will deal only with its 0.91 creation capabilities. Currently, it does not support the additional elements within RSS 0.92.

Example 4-4 shows a simple Perl script that creates the feed shown in Example 4-5.

Example 4-4. A sample XML::RSS script

#!/usr/local/bin/perl -w
   
## Chapter 4, Example 1.
## Create an example RSS 0.91 feed
   
use XML::RSS;
   
my $rss = new XML::RSS (version => '0.91');
   
$rss->channel(title          => 'The Title of the Feed',
              link           => 'http://www.oreilly.com/example/',
              language       => 'en', 
              description    => 'An example feed created by XML::RSS',
              lastBuildDate  => 'Tue, 04 Jun 2002 16:20:26 GMT',
              docs           => 'http://backend.userland.com/rss092',
              );
   
$rss->image(title       => 'Oreilly',
            url         => 'http://meerkat.oreillynet.com/icons/meerkat-powered.jpg',
            link        => 'http://www.oreilly.com/example/',
            width       => 88,
            height      => 31,
            description => 'A nice logo for the feed'
            );
   
$rss->textinput(title => "Search",
                description => "Search the site",
                name  => "query",
                link  => "http://www.oreilly.com/example/search.cgi"
                );
   
$rss->add_item( title       => "Example Entry 1",
                link        => "http://www.oreilly.com/example/entry1",
                description => 'blah blah'
               );
   
$rss->add_item( title       => "Example Entry 2",
                link        => "http://www.oreilly.com/example/entry2",
                description => 'blah blah'
               );
   
$rss->add_item( title       => "Example Entry 3",
                link        => "http://www.oreilly.com/example/entry3",
                description => 'blah blah'
               );
   
$rss->save("example.rss");

Example 4-5. The resultant RSS 0.91 feed

<?xml version="1.0" encoding="UTF-8"?>
   
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
            "http://my.netscape.com/publish/formats/rss-0.91.dtd">
   
<rss version="0.91">
   
<channel>
<title>The Title of the Feed</title>
<link>http://www.oreilly.com/example/</link>
<description>An example feed created by XML::RSS</description>
<language>en</language>
<lastBuildDate>Tue, 04 Jun 2002 16:20:26 GMT</lastBuildDate>
<docs>http://backend.userland.com/rss092</docs>
   
<image>
<title>Oreilly</title>
<url>http://meerkat.oreillynet.com/icons/meerkat-powered.jpg</url>
<link>http://www.oreilly.com/example/</link>
<width>88</width>
<height>31</height>
<description>A nice logo for the feed</description>
</image>
   
<item>
<title>Example Entry 1</title>
<link>http://www.oreilly.com/example/entry1</link>
<description>blah blah</description>
</item>
   
<item>
<title>Example Entry 2</title>
<link>http://www.oreilly.com/example/entry2</link>
<description>blah blah</description>
</item>
   
<item>
<title>Example Entry 3</title>
<link>http://www.oreilly.com/example/entry3</link>
<description>blah blah</description>
</item>
   
<textinput>
<title>Search</title>
<description>Search the site</description>
<name>query</name>
<link>http://www.oreilly.com/example/search.cgi</link>
</textinput>
   
</channel>
</rss>

After the required Perl module declaration, we create a new instance of XML::RSS, like so:

my $rss = new XML::RSS (version => '0.91');

The new method returns a reference to the new XML::RSS object. It can take three arguments, two of which we are interested in here:

new XML::RSS (version=>$version, encoding=>$encoding);

The version attribute refers to the version of RSS you want to make (either '0.91' or '1.0'), and the encoding attribute sets the encoding of the XML declaration. The default encoding is UTF-8.
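
As a small illustration of the encoding argument, the following sketch builds an otherwise empty 0.91 feed declared as ISO-8859-1 and prints it rather than saving it. The channel values are placeholders invented for this illustration, not part of the original examples.

```perl
use strict;
use warnings;
use XML::RSS;

# Same constructor as in Example 4-4, with an explicit encoding.
my $rss = XML::RSS->new(version => '0.91', encoding => 'ISO-8859-1');

# Placeholder channel data, for illustration only.
$rss->channel(title       => 'Encoding test',
              link        => 'http://www.example.org/',
              description => 'A feed declared as ISO-8859-1',
              language    => 'en',
              );

# as_string returns the feed as a string instead of writing a file.
my $xml = $rss->as_string;
print $xml;
```

The XML declaration at the top of the output should then name ISO-8859-1 instead of the default UTF-8. XML::RSS->new(...) is the same call as the new XML::RSS (...) form used in Example 4-4. Note that the argument only sets the declaration, so the strings you pass in must already be in that encoding.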

The rest of the script is quite self-explanatory. The methods channel, image, textinput, and add_item all add new elements and associated values to the feed you are creating, and the $rss->save method saves the created feed as a file.

In Example 4-4, we’re passing known strings to the module. Therefore, it is not of much use as a script; we need to add a more dynamic form of data, or the feed will be very boring indeed.

Creating an RSS feed with the Google SOAP API

In the absence of a generalized publishing system to play with, let’s use Google’s SOAP API. This web-services interface was released with much fanfare in April 2002, and at the time of this writing it is still an experimental affair. It may even be defunct by the time you read this book, but you’ll get the idea.

The Google API requires a developer’s key. This is readily available (again, at the time of this writing) from http://www.google.com/apis — I have left it out of the code here, as daily usage is limited and I’m fond of my own. You will also need to grab Google’s WSDL file, which the SOAP::Lite module requires.

The script in Example 4-6 is designed to be run from a web browser. It takes two parameters — the query and the Google API key — so the URL would look something like this:

http://example.org/googlerss.cgi?q=queryHere&k=YourVeryOwnGoogleKeyHere

Example 4-6. googlerss.cgi Google API to RSS using Perl

#!/usr/local/bin/perl -w
use strict;
use SOAP::Lite;
use XML::RSS;
use CGI qw(:standard);
use HTML::Entities ();
   
# Set up the query term from the cgi input
my $query = param("q");
my $key   = param("k");
   
# Initialise the SOAP interface
my $service = SOAP::Lite -> service('http://api.google.com/GoogleSearch.wsdl');
   
# Run the search
my $result = $service -> doGoogleSearch ($key, $query, 0, 10, "false", "", "false",
"", "latin1", "latin1");
   
# Create the new RSS object
my $rss = new XML::RSS (version => '0.91');
   
# Add in the RSS channel data
$rss->channel(title       => "Google Search for $query",
              link        => "http://www.google.com/search?q=$query",
              description => "Google search for $query",
              language    => "en",
              );
   
# Add in the required image.
# The link is double-quoted so that $query is interpolated.
$rss->image(title       => 'Google2RSS',
            url         => 'http://www.example.org/icons/google2rss.jpg',
            link        => "http://www.google.com/search?q=$query",
            width       => 88,
            height      => 31,
            description => 'Google2RSS'
            );
   
# Create each of the items
foreach my $element (@{$result->{'resultElements'}}) {
    $rss->add_item(
        title => HTML::Entities::encode($element->{'title'}),
        link  => HTML::Entities::encode($element->{'URL'})
    );
}
   
# Print out the RSS
print header('application/xml+rss'), $rss->as_string;

Example 4-7 shows the RSS file created by the script in Example 4-6.

Example 4-7. The resultant RSS file from the Google script, searching for RSS

<?xml version="1.0" encoding="UTF-8"?>
   
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
            "http://my.netscape.com/publish/formats/rss-0.91.dtd">
   
<rss version="0.91">
<channel>
<title>Google Search for RSS</title>
<link>http://www.google.com/search?q=RSS</link>
<description>Google search for RSS</description>
<language>en</language>
<image>
<title>Google2RSS</title>
<url>http://www.example.org/icons/google2rss.jpg</url>
<link>http://www.google.com/search?q=RSS</link>
<width>88</width>
<height>31</height>
<description>Google2RSS</description>
</image>
<item>
<title>MAPS &lt;b&gt;RSS&lt;/b&gt;</title>
<link>http://work-rss.mail-abuse.org/rss/</link>
</item>
<item>
<title>Yahoo! Groups</title>
<link>http://www.purl.org/rss/1.0/</link>
</item>
<item>
<title>&lt;b&gt;RSS&lt;/b&gt; 0.92</title>
<link>http://backend.userland.com/rss092</link>
</item>
<item>
<title>&lt;b&gt;RSS&lt;/b&gt; 0.91</title>
<link>http://backend.userland.com/stories/rss091</link>
</item>
<item>
<title>Royal Statistical Society</title>
<link>http://www.rss.org.uk/</link>
</item>
<item>
<title>Latest &lt;b&gt;RSS&lt;/b&gt; News (&lt;b&gt;RSS&lt;/b&gt; Info)</title>
<link>http://blogspace.com/rss/</link>
</item>
<item>
<title>Yahoo! Groups</title>
<link>http://groups.yahoo.com/group/rss-dev/files/specification.html</link>
</item>
<item>
<title>Yahoo! Groups : &lt;b&gt;rss&lt;/b&gt;-dev</title>
<link>http://groups.yahoo.com/group/rss-dev/</link>
</item>
<item>
<title>Yahoo! Groups</title>
<link>http://groups.yahoo.com/files/rss-dev/specification.html</link>
</item>
<item>
<title>O'Reilly Network: &lt;b&gt;RSS&lt;/b&gt; DevCenter</title>
<link>http://www.oreillynet.com/rss/</link>
</item>
</channel>
</rss>

Walking through the script in Example 4-6, we see it loads the required modules and then sets up the CGI parameters. The SOAP interface is initialized, and the query is sent via the method doGoogleSearch.

At this point, $result holds the response returned by Google; its resultElements entry is the array of individual results. We leave it there for a moment and initialize XML::RSS as before. We add the required channel and image details, in this case using the $query string to make the title, link, and description more interesting.
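
One caveat with interpolating $query straight into a link: a query containing spaces or other reserved characters produces an invalid URL. A minimal sketch of one way to guard against that follows, assuming the URI::Escape module is installed. Neither that module nor the search_link helper (a name made up for this illustration) appears in Example 4-6.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Hypothetical helper: build the Google search link for a raw query string.
sub search_link {
    my ($query) = @_;
    return 'http://www.google.com/search?q=' . uri_escape($query);
}

print search_link('content syndication'), "\n";
# prints: http://www.google.com/search?q=content%20syndication
```

In the script, the link arguments could then be written as search_link($query), while the title and description keep the query as the user typed it.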

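Google’s SOAP API returns no more than ten results per query (the script asks for ten in its doGoogleSearch call), which is comfortably within the 15 items that RSS 0.91 allows per channel. There is therefore no need to limit the number of item elements ourselves, and a simple foreach loop is enough to deal with the results.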

But beware! Google’s results contain HTML that has not been entity-encoded: we have to whiz the relevant data through HTML::Entities::encode, or the angle brackets will come out unencoded. Unencoded brackets are not allowed in any form of RSS. (For a complete run-down of correct XML form, see Appendix A.)
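
To see what the encoding step does in isolation, here is a short sketch using a made-up title of the kind Google returns. It calls the routine by its documented name, encode_entities; the $raw and $safe variables are illustrative only.

```perl
use strict;
use warnings;
use HTML::Entities ();

# A title of the kind Google returns, with literal markup in it.
my $raw  = 'Latest <b>RSS</b> News';
my $safe = HTML::Entities::encode_entities($raw);

print "$safe\n";
# prints: Latest &lt;b&gt;RSS&lt;/b&gt; News
```

The encoded form is what ends up inside the title elements of Example 4-7.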

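After that, it’s really just a matter of returning the RSS in the correct manner. Note that we give the returned file a MIME type of application/xml+rss, which was the emergent standard at the time of writing. The type that has since come into common use is application/rss+xml, so you may prefer to pass that to header instead.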

So there it is: a dynamically created RSS feed from a SOAP interface. Other inputs could be included, obviously. For example, we could include a few lines to add a lastBuildDate.
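
A lastBuildDate could be generated along these lines. This is only a sketch: rfc822_date is a helper name invented here, and it assembles the date by hand from gmtime so that the day and month names stay in English whatever the server's locale.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch time in the date style
# used by lastBuildDate in Example 4-4, always in GMT.
sub rfc822_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
                   $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
                   $hour, $min, $sec);
}

print rfc822_date(1023207626), "\n";
# prints: Tue, 04 Jun 2002 16:20:26 GMT
```

In Example 4-6, the value could be passed to the channel method as lastBuildDate => rfc822_date(time).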

When we move on to RSS 1.0, we’ll look at building RSS feeds from multiple data sources, but for that we will have to wait for Chapter 6.

Because of its relatively limited nature, RSS 0.9x tends to be used for simple feeds of simple content. Therefore, RSS 0.9x is usually created automatically by the CMS (blogging software is a prime example).
