The Script’s Data Structure

The first interesting thing in the script is the section at the top, which defines the data structures that will hold the site’s metainformation:

my %page_data;        # HoH w/ primary keys: path
                      #        secondary keys: page template params
                      #        values: corresponding content

my %teacher_students; # HoL, w/ keys of teacher short_name and values
                      # of arrays of student page paths.

my %student_profiles; # HoHoL, w/ primary keys of teacher/student 
                      # string, secondary key of @attribute element, 
                      # and values of leader page paths.
                      
my %cat_leaders;      # HoL, w/ keys of leader cat short_names, and 
                      # values of arrays of leader page paths.

The comments here use a shorthand popular with Perl programmers for describing multilevel data structures. HoH means “hash of hashes,” HoL means “hash of lists” (another name for a “hash of arrays”), and the jolly-sounding HoHoL means “hash of hash of lists.”
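To make the shorthand concrete, here is a minimal sketch of each shape. Only the three variable names come from the declarations above. The page paths, the politics category, and the courage and wisdom attributes are invented for illustration and are not taken from the actual script.

```perl
use strict;
use warnings;

# NOTE: every key and value below is a made-up placeholder.

# HoH: path => { template param => content }
my %page_data = (
    'leaders/example.html' => {
        title       => 'Example Leader',
        description => 'A placeholder page.',
    },
);

# HoL: short_name => [ page paths ]
my %cat_leaders = (
    politics => [ 'leaders/example.html', 'leaders/another.html' ],
);

# HoHoL: teacher/student string => { attribute => [ page paths ] }
my %student_profiles = (
    'teacher_a/student_b' => {
        courage => [ 'leaders/example.html' ],
    },
);

# In an HoH, two keys lead to a plain scalar value.
print $page_data{'leaders/example.html'}{title}, "\n";

# In an HoL, one key leads to an array reference; dereference it to loop.
print "$_\n" for @{ $cat_leaders{politics} };

# In an HoHoL, two keys lead to an array reference. Pushing onto a
# key that does not exist yet works because Perl autovivifies the
# missing inner array.
push @{ $student_profiles{'teacher_a/student_b'}{wisdom} },
    'leaders/another.html';
print scalar( @{ $student_profiles{'teacher_a/student_b'}{wisdom} } ), "\n";
```

The pattern to notice is that each extra letter in the shorthand adds one more subscript: one pair of braces reaches the list in an HoL, two reach the value in an HoH, and two plus an array dereference reach the elements in an HoHoL.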

The %page_data hash of hashes is really the heart of the script. It is built up during the script’s first cycle through the site’s HTML pages. In effect, it is a little database that embodies all the META headers, TITLE tags, and comment-delimited content blocks of the site’s HTML pages. That hash of hashes has a first-level key consisting of the page’s path and filename, a second-level key of the name of the page attribute, and a value of the corresponding content. That is, for the Al Gore leader page whose META headers we looked at earlier, the entry ...
