Producing longer Soundex codes

Soundex coding is an abstract way to represent the sound of a word; it was invented to help identify when the same name might be spelt differently over a period of time, but is used more generally to help identify variant spellings of the same name or word. For example spelled and spelt would have the same Soundex code.

Normally, a Soundex code represents a word by its initial letter capitalized, followed by three numeric digits (0-6) representing groups of letters that might be substituted for one another. The numeric digit codes correspond to letters of the alphabet as follows:

Numeric Digit Group Code

Letters

1

B P F V

2

C S K G J Q X Z

3

D T

4

L

5

M N

6

R

All other letters (vowels plus Y, H & W) are ...

Get IBM SPSS Modeler Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.