
Mastering Concurrency Programming with Java 8 by Javier Fernández González

An example of a document clustering application

This application reads a set of documents and organizes them using the k-means clustering algorithm. To achieve this, we will use four components:

  • The Reader system: This system will read all the documents and convert every document into a list of String objects.
  • The Indexer system: This system will process the documents and convert them into a list of words. At the same time, it will generate the global vocabulary of the set of documents, containing every word that appears in them.
  • The Mapper system: This system will convert each list of words into a mathematical representation using the vector space model. The value of each item will be the Tf-Idf (short for term frequency–inverse document frequency ...
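
The Indexer and Mapper steps described above can be sketched as follows. This is a minimal, sequential illustration rather than the book's implementation: the class name `TfIdfSketch` and its methods `tokenize`, `documentFrequency`, and `tfIdf` are hypothetical, and because the excerpt is cut off before it states a formula, the sketch assumes one common Tf-Idf variant (term count divided by document length, multiplied by the natural logarithm of the number of documents over the number of documents containing the word).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the Indexer and Mapper steps; not the book's code.
final class TfIdfSketch {

    // Indexer step: lower-case the text and split it into words.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Global vocabulary: maps each word to the number of documents containing it.
    // A TreeMap keeps the words sorted, so every vector uses the same dimension order.
    static Map<String, Integer> documentFrequency(List<List<String>> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (List<String> doc : docs) {
            for (String w : new HashSet<String>(doc)) {
                df.merge(w, 1, Integer::sum);
            }
        }
        return df;
    }

    // Mapper step: one dimension per vocabulary word.
    // Assumed formula: tf = count / document length, idf = ln(totalDocs / docs containing the word).
    static double[] tfIdf(List<String> doc, Map<String, Integer> df, int totalDocs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : doc) {
            counts.merge(w, 1, Integer::sum);
        }
        double[] vector = new double[df.size()];
        int i = 0;
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            int count = counts.getOrDefault(e.getKey(), 0);
            double tf = doc.isEmpty() ? 0.0 : (double) count / doc.size();
            double idf = Math.log((double) totalDocs / e.getValue());
            vector[i++] = tf * idf;
        }
        return vector;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                tokenize("Java concurrency in practice"),
                tokenize("Java streams and lambdas"));
        Map<String, Integer> df = documentFrequency(docs);
        for (List<String> doc : docs) {
            System.out.println(Arrays.toString(tfIdf(doc, df, docs.size())));
        }
    }
}
```

Under this assumed formula, a word that occurs in every document (here, "java") gets a weight of zero, so it contributes nothing when k-means later compares the vectors.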
