Datasets

One of the best-known repositories of machine learning datasets is hosted by the University of California, Irvine. The UCI repository contains over 300 datasets covering a wide variety of challenges, including poker, movies, wine quality, activity recognition, stocks, taxi service trajectories, advertisements, and many others. Each dataset is usually accompanied by a research paper in which it was used, which can hint at how to get started and what the prediction baseline is.

The UCI machine-learning repository can be accessed at https://archive.ics.uci.edu.
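Once a dataset file has been downloaded from the repository, a quick first check is to see how its class labels are distributed. The following is a minimal sketch in plain Java, not a method from the repository itself. It assumes a comma-separated file with no header row and the class label in the last column, a layout used by files such as the Iris dataset's iris.data. The class name UciCsvSketch and the method countLabels are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class UciCsvSketch {

    // Counts how often each value appears in the last comma-separated column.
    // Assumption: no header row, and the class label is the last field.
    // Blank lines (some files end with them) are skipped.
    static Map<String, Integer> countLabels(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String label = trimmed.substring(trimmed.lastIndexOf(',') + 1).trim();
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a downloaded file as the first argument;
        // without one, a few sample rows in the assumed layout are used.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(
                        "5.1,3.5,1.4,0.2,Iris-setosa",
                        "7.0,3.2,4.7,1.4,Iris-versicolor",
                        "",
                        "4.9,3.0,1.4,0.2,Iris-setosa");
        System.out.println(countLabels(lines));
    }
}
```

Running the class with the path to a downloaded file prints one count per class label. Datasets that use a header row, a different delimiter, or a label in another column would need the parsing adjusted accordingly.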

Another well-maintained collection by Xiaming Chen ...
