Identifying key words in a corpus of text

One way to predict the topic of a paragraph or sentence is to identify what its words refer to. Part-of-speech tags give some insight into each word, but they do not reveal what kind of thing the word names. In this recipe, we will use a Haskell library to tag words with categories such as PERSON, CITY, DATE, and so on.
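To make the idea of category tagging concrete, here is a minimal sketch in plain Haskell. It does not use sequor, whose interface is not shown in this excerpt. Instead, it looks up each word in a small hand-written table, which is far cruder than a trained sequence labeler. The names Label, gazetteer, stripPunct, and tagWords, along with the sample entries, are assumptions made up for illustration.

```haskell
module Main where

import Data.Char (isPunctuation)

-- | Hypothetical label set, mirroring the categories named in the text.
data Label = Person | City | Date
  deriving (Show, Eq)

-- | A tiny hand-written lookup table (an assumption for illustration only;
-- a real tagger would learn its labels from annotated data).
gazetteer :: [(String, Label)]
gazetteer =
  [ ("Alice", Person)
  , ("Paris", City)
  , ("Monday", Date)
  ]

-- | Strip leading and trailing punctuation from a token.
stripPunct :: String -> String
stripPunct = reverse . dropWhile isPunctuation . reverse . dropWhile isPunctuation

-- | Pair each whitespace-separated token with its label, if it has one.
tagWords :: String -> [(String, Maybe Label)]
tagWords s =
  [ (w, lookup w gazetteer)
  | w <- map stripPunct (words s)
  , not (null w)
  ]

main :: IO ()
main = mapM_ print (tagWords "Alice flew to Paris on Monday.")
```

Words missing from the table are paired with Nothing, which plays the role of an "other" tag. A lookup table cannot handle unseen names or ambiguous words, which is why a trained tagger is used in practice.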

Getting ready

This recipe requires an Internet connection to download the sequor package.

Install it using cabal:

$ cabal install sequor --prefix=`pwd`

Otherwise, follow these directions to install it manually:

  1. Obtain the latest version of the sequor library by visiting the following URL in a browser: http://hackage.haskell.org/package/sequor.
  2. Under the Downloads section, download ...
