Natural Language Processing with Java and LingPipe Cookbook by Krishna Dayanidhi, Breck Baldwin

Interesting phrase detection

Imagine a program that takes a collection of text and automatically finds the interesting parts, where "interesting" means that a word or phrase occurs more often than expected. The approach has a useful property: it needs no training data, and it works for any language that we have tokens for. You have most likely seen it in tag clouds such as the one in the following figure:

[Figure: tag cloud generated for the lingpipe.com home page]

The preceding figure shows a tag cloud generated for the lingpipe.com home page. Be aware, however, that tag clouds have been called the "mullets of the Internet" by Jeffrey Zeldman (http://www.zeldman.com/daily/0405d.shtml), so you will ...
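
To make "occurs more often than expected" concrete, here is a minimal sketch in plain Java. It is not the book's recipe and does not use the LingPipe API. The class name InterestingPhrases, the method score, the minCount parameter, and the simple regex tokenizer are all assumptions made for illustration. The sketch scores each two-word phrase by the ratio of its observed count to the count we would expect if the two words occurred independently of each other.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of LingPipe.
class InterestingPhrases {

    /**
     * Returns bigrams seen at least minCount times, sorted by the ratio
     * observed count / expected count, highest first. The expected count
     * assumes the two words occur independently:
     * count(w1) * count(w2) / totalTokens.
     */
    static List<Map.Entry<String, Double>> score(String text, int minCount) {
        // Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }

        // Count single tokens and adjacent token pairs.
        Map<String, Integer> unigrams = new HashMap<>();
        Map<String, Integer> bigrams = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            unigrams.merge(tokens.get(i), 1, Integer::sum);
            if (i + 1 < tokens.size()) {
                bigrams.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
            }
        }

        // Score each sufficiently frequent bigram against its expected count.
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, Integer> e : bigrams.entrySet()) {
            if (e.getValue() < minCount) {
                continue;
            }
            String[] words = e.getKey().split(" ");
            double expected = (double) unigrams.get(words[0]) * unigrams.get(words[1])
                    / tokens.size();
            scored.add(new AbstractMap.SimpleImmutableEntry<>(
                    e.getKey(), e.getValue() / expected));
        }
        scored.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return scored;
    }
}
```

Calling InterestingPhrases.score on "new york is large and new york is busy but the river is large while the bridge is busy" with a minCount of 2 ranks "new york" first, because both of its words appear only inside that phrase, while pairs such as "is large" are diluted by the frequent word "is". As in the text above, nothing here is trained and nothing is language-specific beyond the tokenizer.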