Part II: HASHING FOR STORAGE: DATA MANAGEMENT

CHAPTER 10

Record Chaining with Hash Tables

10.1 SEPARATE CHAINING OF RECORDS

The cost of SEARCH, as measured by the number of comparisons, when N keys are stored as in Example 7.1, in a single chain x1D49E_EuclidMathOne_10n_000100 is O(N). A significant improvement might be expected by combining record chaining with hashing, which results in chains rooted at hash-table cells. When a single chain of N records is replaced by r chains x1D49E_EuclidMathOne_10n_0001000, x1D49E_EuclidMathOne_10n_0001001, ··· , x1D49E_EuclidMathOne_10n_000100r−1 of lengths L0, L1, ··· , Lr−1 with N = L0 + L1 + ··· + Lr−1, the number of comparisons required in SEARCH or INSERT is reduced.

Knuth [Knuth 1973, pp. 513–14] referred to the merge of hashing and chaining as separate chaining; the idea was invented by H.P. Luhn in 1953 (see Chapter 8) and described in memoranda dated 1/3/53 and 2/20/53 to Mr. J. C. McPherson, who then an IBM Vice-President, Luhn wrote1

When in loading the system, an address is found to be occupied by a previous entry, the address number of the next auxilliary address is recorded behind the original address and the new item is recorded at this auxilliary address. ...

Get Hashing in Computer Science: Fifty Years of Slicing and Dicing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.