2.12. Measuring the Frequency of a String

Problem

You need to find out how many times a certain word or piece of text occurs in a string.

Solution

StringUtils.countMatches() returns the frequency of a piece of text within another string:

File manuscriptFile = new File("manuscript.txt");
Reader reader = new FileReader( manuscriptFile );
StringWriter stringWriter = new StringWriter( );
// Copy the file into the StringWriter one character at a time
while( reader.ready( ) ) { stringWriter.write( reader.read( ) ); }
reader.close( );
String manuscript = stringWriter.toString( );

// Convert string to lowercase
manuscript = StringUtils.lowerCase(manuscript);

// count the occurrences of "futility"
int numFutility = StringUtils.countMatches( manuscript, "futility" );

Converting the entire string to lowercase ensures that all occurrences of the word “futility” are counted, regardless of capitalization. After this code runs, numFutility contains the number of occurrences of the word “futility.”
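To make the counting behavior concrete, the following is a minimal plain-Java sketch of non-overlapping substring counting combined with the lowercase step described above. The class MatchCounter and its methods are hypothetical names used for illustration only; this is not the Commons Lang source, just an approximation of the behavior the recipe relies on.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Commons Lang.
final class MatchCounter {

    // Counts non-overlapping occurrences of sub in text; null or empty input yields 0.
    static int countMatches(String text, String sub) {
        if (text == null || sub == null || text.isEmpty() || sub.isEmpty()) {
            return 0;
        }
        int count = 0;
        int index = 0;
        while ((index = text.indexOf(sub, index)) != -1) {
            count++;
            index += sub.length(); // skip past this match so matches never overlap
        }
        return count;
    }

    // Case-insensitive variant: lowercase both sides first, as the recipe does.
    static int countMatchesIgnoreCase(String text, String sub) {
        if (text == null || sub == null) {
            return 0;
        }
        return countMatches(text.toLowerCase(Locale.ROOT), sub.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String manuscript = "Futility! Such futility. FUTILITY everywhere.";
        System.out.println(countMatchesIgnoreCase(manuscript, "futility")); // prints 3
    }
}
```

Because the match is a plain substring search, a word embedded in a longer word is counted as well. If only whole words should count, a regular expression with word boundaries may be a better fit.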

Discussion

If the manuscript.txt file is large, it makes more sense to search the file one line at a time and sum the matches as each line is read, so that the whole file never has to be held in memory. A more memory-efficient version of the previous example would look like this:

File manuscriptFile = new File("manuscript.txt");
Reader reader = new FileReader( manuscriptFile );
LineNumberReader lineReader = new LineNumberReader( reader );
int numOccurrences = 0;

// Count matches one line at a time instead of loading the whole file
while( lineReader.ready( ) ) { 
    String line = StringUtils.lowerCase( lineReader.readLine( ) );
    numOccurrences += StringUtils.countMatches( line, "futility" );
}
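The line-at-a-time approach can also be sketched with only the standard library. The example below is an illustrative assumption, not the book's code: the class LineByLineCounter and the method countInLines are hypothetical names, and a StringReader stands in for the manuscript file so the sketch runs without any file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Commons Lang.
final class LineByLineCounter {

    // Sums case-insensitive, non-overlapping matches of word across all lines of source.
    static int countInLines(Reader source, String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return 0; // an empty search string never matches
        }
        int total = 0;
        try (BufferedReader lines = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of input
            while ((line = lines.readLine()) != null) {
                String lower = line.toLowerCase(Locale.ROOT);
                int index = 0;
                while ((index = lower.indexOf(needle, index)) != -1) {
                    total++;
                    index += needle.length();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        Reader sample = new StringReader("Futility again.\nNo match here.\nfutility, FUTILITY.\n");
        System.out.println(countInLines(sample, "futility")); // prints 3
    }
}
```

Only one line is held in memory at a time, which is the point of the approach. The trade-off is that a search string spanning a line break is never matched.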

Your random ...
