Chaining multiple values into a hash

One of the bigger problems with using a DBM file with the DB_HASH storage mechanism is that the keys against which the data is stored must be unique. For example, if we stored two different values under the key "Wiltshire", say for Stonehenge and Avebury, generally only the last value inserted would remain in the database, replacing the earlier one. This is a bit problematic, to say the least.
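The overwrite can be seen in a minimal sketch, assuming the DB_File module is installed. The temporary directory, the file name megaliths_hash.dbm, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl;                          # supplies O_RDWR and O_CREAT
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical scratch location; any writable path would do.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/megaliths_hash.dbm";

# Tie a Perl hash to a DBM file that uses the DB_HASH storage mechanism.
tie my %sites, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

$sites{'Wiltshire'} = 'Stonehenge';
$sites{'Wiltshire'} = 'Avebury';    # same key: replaces the previous value

my $stored = $sites{'Wiltshire'};
my $count  = scalar keys %sites;
print "Wiltshire => $stored ($count key)\n";

untie %sites;
```

Only one record survives under the key, so Stonehenge is lost as soon as Avebury is stored.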

In a good database design, the primary key of any data structure should generally be unique in order to speed up searches. But quick and dirty databases, badly designed ones, or databases with poor data quality may not be able to enforce this uniqueness. Similarly, using referential hashtables to provide nonprimary key searching of the database also triggers this problem.

A Perl solution to this problem is to push the multiple values onto an array that is stored within the hash element. This technique works fine while the program is running, because the array references are still valid, but once the database has been written out and reloaded, the stored references no longer point at live arrays and the data is unusable.
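A small sketch of why the reloaded data is unusable, on the assumption that the DBM layer stores only plain strings: what ends up on disk is the reference's string form, not the array contents. No DBM file is needed to show this, and the variable names are illustrative only.

```perl
use strict;
use warnings;

# The in-memory structure: one key's worth of multiple values.
my $values = [ 'Stonehenge', 'Avebury' ];

# Interpolating the reference yields the string that would be written out.
my $flattened = "$values";
print "Stored form: $flattened\n";   # looks like ARRAY(0x...), address varies

# The string is only a label for a memory address in this process;
# it is no longer a reference and carries none of the array's contents.
my $is_ref = ref($flattened) ? 1 : 0;
print "Still a reference? ", ( $is_ref ? "yes" : "no" ), "\n";
```

Once the program exits, the address in that string refers to nothing, so neither Stonehenge nor Avebury can be recovered from it.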

Therefore, to solve this problem, we need to look at a different Berkeley DB storage management method, DB_BTREE, which orders its keys prior to insertion. With this mechanism, it is possible to have duplicate keys, because the underlying DBM file is in the form of an array rather than a hashtable. Fortunately, you still reference the DBM file via a Perl hashtable, so DB_BTREE is not any ...
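The text is cut off before any code appears, so the following is only a sketch of how duplicate keys might be stored with DB_BTREE. The R_DUP flag and the get_dup method come from the DB_File module's documented interface rather than from the passage above, and the file and variable names are invented for the example.

```perl
use strict;
use warnings;
use Fcntl;                          # supplies O_RDWR and O_CREAT
use DB_File;
use File::Temp qw(tempdir);

# Hypothetical scratch location; any writable path would do.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/megaliths_btree.dbm";

# Ask the B-tree to accept duplicate keys; set this before tying.
$DB_BTREE->{'flags'} = R_DUP;

# Keep the object returned by tie so its methods can be called directly.
my $db = tie my %sites, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie $file: $!";

# Same key twice: with R_DUP set, both records are kept.
$sites{'Wiltshire'} = 'Stonehenge';
$sites{'Wiltshire'} = 'Avebury';

# In list context, get_dup returns every value stored under the key.
my @wiltshire = sort $db->get_dup('Wiltshire');
print "Wiltshire: @wiltshire\n";

undef $db;                          # drop the extra reference before untie
untie %sites;
```

An ordinary lookup such as $sites{'Wiltshire'} still returns a single value, which is why the example goes through the tied object to retrieve all of them.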
