How to do it...

  1. We downloaded the prepared data file in LIBSVM format from: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/glass.scale

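The dataset contains 214 rows. The original UCI data lists 11 attributes (an ID number, nine measured features, and the glass type); the LIBSVM file keeps the nine measured features plus the class label.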
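To make the LIBSVM layout concrete, here is a minimal sketch of how one line of such a file can be read by hand. Each line has the form `label index:value index:value ...`, with 1-based feature indices. The helper names `parse_libsvm_line` and `to_dense` and the sample line are hypothetical illustrations, not taken from the recipe or from the actual glass.scale file; in Spark itself you would normally rely on the built-in libsvm data source rather than code like this.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)  # LIBSVM indices are 1-based
    return label, features


def to_dense(features, num_features):
    """Expand a sparse {index: value} mapping into a dense list, filling gaps with 0.0."""
    return [features.get(i, 0.0) for i in range(1, num_features + 1)]


# Made-up sample line for illustration only (not a row from glass.scale).
sample = "1 1:-0.134323 2:-0.124812 4:1"
label, features = parse_libsvm_line(sample)
dense = to_dense(features, 4)
```

Features that are missing from a line (index 3 in the sample) are treated as zero, which is why the format suits sparse data.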

  2. The original dataset and data dictionary are also available at the UCI website: http://archive.ics.uci.edu/ml/datasets/Glass+Identification
    • ID number: 1 to 214
    • RI: Refractive index
    • Na: Sodium (unit measurement: weight percent in corresponding oxide, as are attributes 4-10)
    • Mg: Magnesium
    • Al: Aluminum
    • Si: Silicon
    • K: Potassium
    • Ca: Calcium
    • Ba: Barium
    • Fe: Iron

Type of glass (the class attribute, which we will try to recover as clusters using BisectingKMeans()):

  • building_windows_float_processed
  • building_windows_non_float_processed
  • vehicle_windows_float_processed
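
The idea behind bisecting k-means can be sketched in a few lines of plain Python: start with all points in one cluster, then repeatedly split the cluster with the largest within-cluster squared error into two using a small 2-means step, until the requested number of clusters is reached. This is an illustrative sketch only, not Spark's BisectingKMeans() implementation; the function names, the deterministic seeding rule, and the toy points are all assumptions made for the example.

```python
from statistics import fmean


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return [fmean(column) for column in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    center = centroid(points)
    return sum(sq_dist(p, center) for p in points)


def two_means(points, iterations=10):
    """Split points into two groups with a deterministic 2-means (assumed seeding)."""
    center = centroid(points)
    first = max(points, key=lambda p: sq_dist(p, center))   # farthest from centroid
    second = max(points, key=lambda p: sq_dist(p, first))   # farthest from first seed
    centers = [first, second]
    left, right = list(points), []
    for _ in range(iterations):
        left = [p for p in points if sq_dist(p, centers[0]) <= sq_dist(p, centers[1])]
        right = [p for p in points if sq_dist(p, centers[0]) > sq_dist(p, centers[1])]
        if not left or not right:
            break
        centers = [centroid(left), centroid(right)]
    return left, right


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        target = max(clusters, key=sse)
        left, right = two_means(target)
        if not left or not right:
            break  # the worst cluster cannot be split any further
        clusters.remove(target)
        clusters.extend([left, right])
    return clusters


# Toy data for illustration: three well-separated groups on a line.
toy = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0), (101, 0)]
result = bisecting_kmeans(toy, 3)
```

Because each step only ever splits one cluster in two, the result forms a hierarchy of clusters, which is the main difference from running ordinary k-means once with the final value of k.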
