Part II: HASHING FOR STORAGE: DATA MANAGEMENT

CHAPTER 7

Basic Concepts

7.1 OVERVIEW OF THE RECORDS MANAGEMENT PROBLEM

The mechanics of using hashing in storage management is examined in this chapter. Storage can mean either the computer main random-access storage or file systems, which is also known as pervasive storage.

We view information in storage as composed of records {Ri}. The records may be of variable sizes and are located in the system storage media (typically a disk) in a variety of regions, especially in dynamic file systems, where additions and deletions of records make it difficult and impractical to insist on contiguity. Using the information means gaining access to the records and performing operations on records.

A unique key is associated with and stored in each record. This key could be a name or a suitable number, but in general, any string can serve as a unique identifier. It is used by the programs processing the data to locate (address) the desired record. When a telephone directory is searched for a telephone number, the process is somewhat easier than the general instance of file management because the records composed of the triple (Name_Address_TelephoneNumber) are stored sorted on the Name field.

File management is complicated because most data sets are dynamic rather than static, meaning existing records are updated or deleted and new records are added.

The set of program keywords recognized by a compiler is fixed; the list of variable names it needs ...

Get Hashing in Computer Science: Fifty Years of Slicing and Dicing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.