MALLET

The Machine Learning for Language Toolkit (MALLET) is a large library of natural language processing algorithms and utilities. It can be used for a variety of tasks, such as document classification, document clustering, information extraction, and topic modelling. It provides a command-line interface as well as a Java API for several algorithms, such as Naive Bayes, hidden Markov models (HMMs), latent Dirichlet allocation (LDA) topic models, logistic regression, and conditional random fields.

MALLET is available under the Common Public License 1.0, which permits its use in commercial applications. It can be downloaded from http://mallet.cs.umass.edu. A MALLET instance is represented by four fields: name, label, data, and source. However, there are two methods to import data ...
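To make the four fields of an instance concrete, here is a minimal sketch in plain Java. The class InstanceSketch and the values used are hypothetical stand-ins invented for illustration; this is not MALLET's own class (MALLET ships cc.mallet.types.Instance, which holds the same four pieces of information), so treat it as a picture of the data structure rather than the library's API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT MALLET's cc.mallet.types.Instance.
// It mirrors the four fields named in the text: name, label, data, and source.
final class InstanceSketch {
    final String name;        // identifier of the instance, e.g. a document ID
    final String label;       // target class used for classification
    final List<String> data;  // the content to learn from, here a list of tokens
    final String source;      // where the instance came from, e.g. a file path

    InstanceSketch(String name, String label, List<String> data, String source) {
        this.name = name;
        this.label = label;
        this.data = data;
        this.source = source;
    }

    @Override
    public String toString() {
        return name + " [" + label + "] " + data + " <- " + source;
    }
}

class InstanceSketchDemo {
    // Builds one example instance from made-up values.
    static InstanceSketch example() {
        List<String> tokens = Arrays.asList("cheap", "pills", "online");
        return new InstanceSketch("doc-001", "spam", tokens, "inbox/doc-001.txt");
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

In this sketch the data field is a simple token list; in practice the data usually starts as raw text and is transformed into features before training.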
