Clustering words by their lexemes

Words that look alike, such as different forms of the same lexeme, can be clustered together. The clustering algorithm in the lexeme-clustering package is based on Janicki's research paper, "A Lexeme-Clustering Algorithm for Unsupervised Learning of Morphology". The paper is available at http://skil.informatik.uni-leipzig.de/blog/wp-content/uploads/proceedings/2012/Janicki2012.37.pdf.
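
To make the idea of grouping look-alike words concrete, here is a deliberately naive sketch that clusters words sharing the same first few letters. It is not the algorithm from Janicki's paper, and it is not the lexeme-clustering package's API; the names prefixKey and clusterByPrefix are hypothetical and exist only for this illustration.

```haskell
import Data.Char (toLower)
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical helper (not part of lexeme-clustering):
-- the key a word is grouped under, here its lowercased first n characters.
prefixKey :: Int -> String -> String
prefixKey n = map toLower . take n

-- Naive stand-in for lexeme clustering: words with the same prefix key
-- end up in the same cluster. sortOn is stable, so words keep their
-- original relative order inside each cluster.
clusterByPrefix :: Int -> [String] -> [[String]]
clusterByPrefix n = groupBy ((==) `on` prefixKey n) . sortOn (prefixKey n)
```

For example, clusterByPrefix 4 ["walk", "walked", "walking", "talk", "talks", "jump"] evaluates to [["jump"], ["talk", "talks"], ["walk", "walked", "walking"]]. A fixed-length prefix is a crude notion of "looking alike"; it is used here only because it is short enough to read at a glance.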

Getting ready

This recipe requires an Internet connection to download the package from GitHub.

How to do it…

Follow these steps to install and use the library:

  1. Obtain the lexeme-clustering library from GitHub. If Git is installed, enter the following command; otherwise, download the archive from https://github.com/BinRoot/lexeme-clustering/archive/master.zip ...
