Convert Text-Based Blogs into Podcasts

Use the speech synthesizer on the Macintosh to turn text RSS feeds into podcasts and import them into iTunes.

If you get addicted to podcasts, you will find yourself wishing that your regular feeds were podcasts. However, those feeds are in text. Short of getting someone to read them, how do you get them into audio? You can use a speech synthesizer that turns the text into speech.

It might not be the most pleasant way to listen to text, but if you are at the gym and you want to get the latest technology headlines, you can use a speech synthesizer to read the headlines into MP3 files for iTunes [Hack #4].

The Code

Save this code as asmac.pl:

    #!/usr/bin/perl -w
    use LWP::Simple;
    use FileHandle;
    use Cwd;
    use strict;

    # The URL of the RSS feed

    use constant URL => "http://www.mysite.com/myrss.xml";

    # The artist name for the MP3s

    use constant ARTIST => "Artist Name";

    # The album name of the MP3s

    use constant ALBUM => "Album Name";

    # The output directory to put the MP3s into

    use constant OUTPUT_DIR => "mp3s"; 

    # Gets the feed and returns a hash of the RSS items, their titles
    # and the temporary filenames

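    sub get_feed($)
    {
      my ( $url ) = @_;
      my $out = {};

      # Fetch the feed; get returns undef on failure
      my $text = get( $url );
      die "Could not fetch $url\n" unless defined $text;

      while( $text =~ /\<item(.*?)\<\/item\>/gs )
      {
        my $item = $1;
        my ( $title ) = $item =~ /\<title\>(.*?)\<\/title\>/gs;
        my ( $desc ) = $item =~ /\<description\>(.*?)\<\/description\>/gs;

        # Skip items with no title; allow an empty description
        next unless defined $title;
        $desc = "" unless defined $desc;

        $title =~ s/[\n\r]/ /g;

        # Strip line breaks, the CDATA wrapper, and any markup
        $desc =~ s/[\n\r]/ /g;
        $desc =~ s/^\s*\<\!\[CDATA\[//;
        $desc =~ s/\]\]\>//;
        $desc =~ s/\<.*?\>//g;
        $desc = $title . ". " . $desc;

        # Remove characters that would break the quoted AppleScript string
        $desc =~ s/\"//g;
        $desc =~ s/\'//g;
        $desc =~ s/,//g;

        # Build a filename from the title; trim before replacing spaces
        my $filename = lc $title;
        $filename =~ s/^\s+//;
        $filename =~ s/\s+$//;
        $filename =~ s/ /_/g;
        $filename =~ s/[.]//g;
        $filename =~ s/-/_/g;
        $filename =~ s/\\//g;
        $filename =~ s/\///g;

        $out->{ $filename } = {
          description => $desc,
          title => substr($title,0,30)
        };
      }

      return $out;
    }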

    # Turns a story into speech as an AIFF file

    sub speakstory($$)
    {
      my ( $text, $filename ) = @_;

      open FH, "|osascript" or die "Could not run osascript: $!\n";
      print FH "set ofile to POSIX file \"$filename\"\n";
      print FH "say \"$text\" saving to ofile\n";
      close FH;
    }

    # Convert an AIFF file to MP3 with the right tags

    sub convert($$$)
    {
      my ( $aiffFile, $mp3File, $desc ) = @_;

      print "Creating $mp3File\n";

      # Pass the arguments as a list so that quotes in the title
      # cannot break the command line
      system( "lame", $aiffFile, $mp3File, "--silent", "--tt", $desc,
        "--ta", ARTIST, "--tl", ALBUM, "-h" );
    }

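    # Get the feed URL and build MP3s for each of the entries

    # Make sure the output directory exists before LAME writes to it
    my $outdir = OUTPUT_DIR;
    mkdir $outdir unless -d $outdir;

    my $items = get_feed( URL );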
    foreach my $filename ( keys %$items )
    {
      speakstory( $items->{ $filename }->{description}, "temp.aiff" );
      convert( "temp.aiff", OUTPUT_DIR."/".$filename.".mp3",
        $items->{ $filename }->{ title } );
      unlink( "temp.aiff" );
    }

    # Import the files into iTunes

    print "Importing the MP3s into iTunes\n";

    open FH, "|osascript" or die "Could not run osascript: $!\n";
    print FH "set ofile to POSIX file \"".getcwd."/".OUTPUT_DIR."\"\n";
    print FH "tell application \"iTunes\" to convert ofile\n";
    close FH;

Running the Hack

To customize this script, change the URL of the feed in the Perl script, as well as the artist and album name. The output directory is just a temporary directory where the MP3 files go on their way to iTunes.

To run this hack, you will need to install the LAME MP3 encoder [Hack #50] for command-line access. Download the most recent LAME .tgz file from http://lame.sf.net/ and extract it. Then, cd to the top-level directory and run ./configure to configure the LAME distribution. Next, run make to build the LAME encoder. Finally, run sudo make install to install the LAME executable on the system.
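Before running asmac.pl, it can help to confirm that the lame executable is reachable from the command line. The following is a minimal sketch of such a check, not part of the original script; the find_in_path helper is a hypothetical name introduced here for illustration.

```perl
#!/usr/bin/perl -w
use strict;
use File::Spec;

# Hypothetical helper: returns the full path of an executable found
# in one of the PATH directories, or undef if it is not there
sub find_in_path
{
  my ( $name ) = @_;

  foreach my $dir ( File::Spec->path() )
  {
    my $candidate = File::Spec->catfile( $dir, $name );
    return $candidate if -f $candidate && -x $candidate;
  }

  return undef;
}

my $lame = find_in_path( "lame" );
if ( defined $lame )
{
  print "Found LAME at $lame\n";
}
else
{
  print "LAME is not on your PATH; install it before running the hack\n";
}
```

If the check reports that LAME is missing after sudo make install, the install directory is probably not in your PATH.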

Hacking the Hack

Some blogs offer two versions of their RSS feeds: one with a truncated version of the blog entries, and another with the complete text of each entry. You will want to point this script at the full-text version so that it will read the entire contents of each blog entry from the text in the RSS feed.

If a full-text feed is not available, you will want to hack the script to fetch the page that each item in the feed links to. The script will then need to extract the text of the entire blog entry from that page. At this point, you can pass the full text back as the description field and let the script do the job of speaking the text into MP3 files.
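One way to approach this is sketched below. It assumes each item carries a plain <link> element, and it strips tags crudely rather than locating the entry body, so real pages would need site-specific extraction. The names item_link, html_to_text, and full_text are hypothetical helpers, not part of asmac.pl.

```perl
#!/usr/bin/perl -w
use strict;

# Pulls the first <link> URL out of the raw XML of one RSS item
sub item_link
{
  my ( $item ) = @_;
  my ( $link ) = $item =~ /<link>\s*(.*?)\s*<\/link>/s;
  return $link;
}

# Crude HTML-to-text: drops scripts, styles, and tags, then
# collapses runs of whitespace into single spaces
sub html_to_text
{
  my ( $html ) = @_;
  $html =~ s/<(script|style)\b.*?<\/\1>/ /gis;
  $html =~ s/<[^>]*>/ /g;
  $html =~ s/\s+/ /g;
  $html =~ s/^ | $//g;
  return $html;
}

# Fetches the page an item links to and returns its text,
# or undef if there is no link or the fetch fails
sub full_text
{
  my ( $item ) = @_;
  my $link = item_link( $item );
  return undef unless defined $link;

  # LWP::Simple is loaded on demand, only when a page is fetched
  require LWP::Simple;
  my $html = LWP::Simple::get( $link );
  return defined $html ? html_to_text( $html ) : undef;
}
```

Inside the loop in get_feed, where the raw $item text is still available, a call such as full_text( $item ) could supply the description whenever the feed's own text is too short.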

iSpeak It

The iSpeak It application on the Macintosh (http://zapptek.com/ispeak-it/) is a commercial RSS-to-speech application.

Figure 1-14 shows the iSpeak It main window, which displays the text that will be converted into speech and then added to your iTunes library. You can specify a list of RSS feeds to read, or just use the default feeds.

Figure 1-14. News headlines pulled from RSS, ready for text-to-speech conversion

Once the spoken news entries are encoded as sound files in your iTunes library, you can sync your iPod to iTunes and listen to the headlines on your way to work.

Admittedly, the computer speech generated by the system-standard MacinTalk drivers is a little hard to take. Check Cepstral Voices (http://cepstral.com/) for some more-pleasing alternatives on Mac and Windows in the $30 range.
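If you install another voice and it shows up as a system voice, the Perl script from earlier in this hack could select it through the using parameter of the AppleScript say command. The sketch below only builds the AppleScript text; say_script is a hypothetical helper, and the voice name "Victoria" is an assumption that must match a voice installed on your machine.

```perl
#!/usr/bin/perl -w
use strict;

# Builds the AppleScript that speaks $text into $filename,
# optionally with a named voice (hypothetical helper)
sub say_script
{
  my ( $text, $filename, $voice ) = @_;

  my $script = "set ofile to POSIX file \"$filename\"\n";
  $script .= "say \"$text\"";
  $script .= " using \"$voice\"" if defined $voice;
  $script .= " saving to ofile\n";

  return $script;
}

# Show the script that would be piped to osascript
print say_script( "Hello from the feed", "temp.aiff", "Victoria" );
```

In speakstory, the two print FH lines could be replaced with a single print FH say_script( $text, $filename, "Victoria" ).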

AutoCast

The equivalent program on Windows is called AutoCast (http://autocastsoftware.com/). The program requires the Microsoft Speech SDK Version 5.1. Full source code is available on the site.

See Also

  • “Speech Synthesize Your Podcast Introduction” [Hack #65]
