Getting the data

At the KDD Cup web page (http://kdd.org/kdd-cup/view/kdd-cup-2009/Data), you should see a page that looks similar to the following screenshot. First, under the Small version (230 var.) header, download orange_small_train.data.zip. Next, download the three sets of true labels associated with this training data. The following files are found under the Real binary targets (small) header:

  • orange_small_train_appetency.labels
  • orange_small_train_churn.labels
  • orange_small_train_upselling.labels

Save and unzip all of the files marked in the red boxes, as shown in the screenshot:
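If you prefer to script the unzip step rather than do it by hand, the following is a minimal sketch using only the standard java.util.zip classes. The class name KddDataUnzipper, the unzip helper, and the data directory are hypothetical names chosen for illustration; they are not part of the book's code, so adjust the paths to wherever you saved the downloads.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class KddDataUnzipper {

    /** Extracts every entry of zipFile into targetDir and returns the extracted files. */
    public static List<Path> unzip(Path zipFile, Path targetDir) throws IOException {
        List<Path> extracted = new ArrayList<>();
        Files.createDirectories(targetDir);
        Path root = targetDir.toAbsolutePath().normalize();

        try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Guard against entries that would escape the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Unsafe zip entry: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    // Copies the bytes of the current entry only; the stream stays open.
                    Files.copy(zis, out, StandardCopyOption.REPLACE_EXISTING);
                    extracted.add(out);
                }
            }
        }
        return extracted;
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of the downloaded archive; change as needed.
        Path dataDir = Paths.get("data");
        for (Path p : unzip(dataDir.resolve("orange_small_train.data.zip"), dataDir)) {
            System.out.println("Extracted " + p);
        }
    }
}
```

Only the training data comes as a zip archive in this sketch; the three .labels files are assumed to be saved alongside it in the same directory.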

In the following sections, first, we will load the ...
