Parsing the Category File

The next step is parsing the information contained in the category file, category.txt. Since that file has a simpler structure than exhibit.txt, this part of the script will be easier.

Choosing a data structure to store the information, though, raises a complication. If you look at category.txt, you’ll see that it contains lines with category names, surrounded by double square brackets, like this:

[[category name]]

Then, for each category, there are one or more lines containing company names and booth numbers, formatted like this:

company name, booth number
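
To make the two line formats concrete, here is a minimal sketch of how a script might tell them apart with regular expressions. It is an illustration only, not the script this chapter builds: the subroutine name classify_line and the sample lines are hypothetical, and the sketch assumes a booth number contains no whitespace.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real contents of category.txt are not shown here.
my @lines = (
    "[[Networking]]",
    "Acme Routers, 101",
    "Widget Works, 215",
);

# Classify one line as a category header, a company entry, or neither.
# (classify_line is an illustrative name, not part of the book's script.)
sub classify_line {
    my ($line) = @_;

    # [[category name]]
    if ($line =~ /^\[\[(.+)\]\]\s*$/) {
        return ('category', $1);
    }

    # company name, booth number
    # The greedy (.+) runs up to the LAST comma, so a company name that
    # itself contains commas stays in one piece.
    if ($line =~ /^(.+),\s*(\S+)\s*$/) {
        return ('company', $1, $2);
    }

    return ('other');
}

for my $line (@lines) {
    my ($type, @fields) = classify_line($line);
    print "$type: @fields\n";
}
```

Testing for the double-bracket pattern first matters, because a category name could itself contain a comma and would otherwise be mistaken for a company line.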

So, we might initially think we want a hash, with keys consisting of the different category names. For values, though, we won’t be able to use a simple scalar value, because what we want to store for each category is a list of exhibitors.

The ideal data structure would be a hash of lists -- that is, a hash where each key would be a category name and the corresponding value would be a list (or an array) containing the company names that go with that category. Unfortunately, we don’t know how to do that yet. For now, we can fake it by using hash values that consist of a list of company names separated by newline characters. That is, the value associated with each category name key will still be a scalar value, but that scalar value will be a string containing company names on separate lines, like this:

first company name
second company name
third company name

And so on. Once we’ve created ...
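
The newline-separated hash values described above can be sketched roughly as follows. This is one possible approach under stated assumptions, not necessarily the code the chapter goes on to develop: the names parse_categories and %companies_for and the sample data are hypothetical, each company name is stored with a trailing newline rather than only a separator between names, and booth numbers are simply discarded.

```perl
use strict;
use warnings;

# Hypothetical sample data in the category.txt format; the names are made up.
my $sample = <<'END';
[[Networking]]
Acme Routers, 101
Widget Works, 215
[[Storage]]
Widget Works, 215
END

# Parse category-file text into a hash whose keys are category names and
# whose values are plain strings holding one company name per line.
sub parse_categories {
    my ($text) = @_;
    my %companies_for;
    my $category;    # the most recent [[category name]] seen

    for my $line (split /\n/, $text) {
        if ($line =~ /^\[\[(.+)\]\]\s*$/) {
            $category = $1;
            $companies_for{$category} = ''
                unless exists $companies_for{$category};
        }
        elsif (defined $category and $line =~ /^(.+),\s*\S+\s*$/) {
            # Append the company name, followed by a newline, to the
            # scalar value for the current category.
            $companies_for{$category} .= "$1\n";
        }
    }
    return %companies_for;
}

my %companies_for = parse_categories($sample);

for my $category (sort keys %companies_for) {
    print "$category:\n";
    # split turns the scalar value back into individual company names.
    for my $company (split /\n/, $companies_for{$category}) {
        print "  $company\n";
    }
}
```

Because each value is an ordinary string, the hash behaves like any other hash of scalars; the list structure exists only by convention and is recovered with split when the names are needed one at a time.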

Perl for Web Site Management by John Callender
