2.17. Calculating Soundex
Problem
You need the Soundex code of a word or a name.
Solution
Use Jakarta Commons Codec’s Soundex. Supply a surname or a word, and Soundex will produce a phonetic encoding:
// Required import declaration
import org.apache.commons.codec.language.Soundex;

// Code body
Soundex soundex = new Soundex( );
String obrienSoundex = soundex.soundex( "O'Brien" );
String obrianSoundex = soundex.soundex( "O'Brian" );
String obryanSoundex = soundex.soundex( "O'Bryan" );
System.out.println( "O'Brien soundex: " + obrienSoundex );
System.out.println( "O'Brian soundex: " + obrianSoundex );
System.out.println( "O'Bryan soundex: " + obryanSoundex );
This will produce the following output for three similar surnames:
O'Brien soundex: O165
O'Brian soundex: O165
O'Bryan soundex: O165
Discussion
Soundex.soundex( ) takes a string, preserves the first letter as a letter code, and then calculates the rest of the code from the consonants that follow. So, names such as “O’Bryan,” “O’Brien,” and “O’Brian,” all being common variants of the Irish surname, are given the same encoding: “O165.” The 1 corresponds to the B, the 6 corresponds to the R, and the 5 corresponds to the N; vowels, along with non-letter characters such as the apostrophe, are discarded before the Soundex code is generated.
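To make the encoding steps concrete, here is a minimal sketch of the classic American Soundex rules in plain Java. It is an illustration only, not the Commons Codec source: the class name SoundexSketch and the method encode are hypothetical, and the sketch assumes plain A-Z input, silently ignoring every other character.

```java
class SoundexSketch {

    // Digit for each letter A..Z; '0' marks letters that are not coded
    // (the vowels plus H, W, and Y).
    private static final String CODES = "01230120022455012623010202";

    // Hypothetical helper: returns a four-character code such as "O165".
    static String encode(String input) {
        // Keep only the letters A-Z, uppercased; apostrophes, spaces, etc. are dropped.
        StringBuilder letters = new StringBuilder();
        for (char c : input.toCharArray()) {
            char upper = Character.toUpperCase(c);
            if (upper >= 'A' && upper <= 'Z') {
                letters.append(upper);
            }
        }
        if (letters.length() == 0) {
            return "";
        }

        // The first letter is preserved as-is.
        StringBuilder code = new StringBuilder();
        code.append(letters.charAt(0));
        char last = CODES.charAt(letters.charAt(0) - 'A');

        for (int i = 1; i < letters.length() && code.length() < 4; i++) {
            char ch = letters.charAt(i);
            // H and W are skipped entirely, so equal codes on either side merge.
            if (ch == 'H' || ch == 'W') {
                continue;
            }
            char digit = CODES.charAt(ch - 'A');
            // Emit a digit only for coded consonants, and only once per run.
            if (digit != '0' && digit != last) {
                code.append(digit);
            }
            last = digit;
        }

        // Pad short codes with zeros to a fixed length of four.
        while (code.length() < 4) {
            code.append('0');
        }
        return code.toString();
    }

    public static void main(String[] args) {
        System.out.println("O'Brien soundex: " + encode("O'Brien"));
        System.out.println("O'Bryan soundex: " + encode("O'Bryan"));
    }
}
```

Tracing “O’Brien” through the sketch: the apostrophe is dropped, O is kept as the leading letter, B yields 1, R yields 6, the vowels I and E yield nothing, and N yields 5, giving “O165.” In real code, prefer the Soundex class from Commons Codec shown in the Solution.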
The Soundex algorithm can be used in a number of situations, but Soundex is usually associated with surnames, as the United States historical census records are indexed using Soundex. In addition to the role Soundex plays in the census, ...