K-means with H2O

Here, we compare the K-means implementation in H2O with the one in Scikit-learn. More specifically, we run the mini-batch experiment using H2OKMeansEstimator, the K-means estimator class available in H2O. The setup is similar to the one shown in the PCA with H2O section, and the experiment is the same as in the preceding section:

In:
import tempfile
import time

import numpy as np
import h2o
from h2o.estimators.kmeans import H2OKMeansEstimator

h2o.init(max_mem_size_GB=4)

def testH2O_kmeans(X, k):
    # Write the data to a CSV file that H2O can import
    temp_file = tempfile.NamedTemporaryFile().name
    np.savetxt(temp_file, np.c_[X], delimiter=",")

    cls = H2OKMeansEstimator(k=k, standardize=True)
    blobdata = h2o.import_file(temp_file)

    # Time only the training step, not the file export or import
    tik = time.time()
    cls.train(x=range(blobdata.ncol), training_frame=blobdata)
    fit_time = time.time() - tik
    ...
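The function above follows a pattern: export the data to a temporary CSV file, load it, and time only the step of interest. The following is a minimal, standard-library-only sketch of that pattern, not part of the original experiment. The names temp_csv, timed, and count_rows are hypothetical helpers introduced here for illustration, and count_rows is only a stand-in for the call being timed (the train call in the listing above). The sketch also removes the temporary file explicitly, since a file created from tempfile.NamedTemporaryFile().name is typically not cleaned up once it has been rewritten by another function.

```python
import csv
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def temp_csv(rows):
    """Write rows to a temporary CSV file, yield its path, then delete it."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as fh:
            csv.writer(fh).writerows(rows)
        yield path
    finally:
        # Explicit cleanup: the file is removed even if the timed step fails
        os.remove(path)


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    tik = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - tik


def count_rows(path):
    """Hypothetical stand-in for the step being timed (for example, a fit)."""
    with open(path, newline="") as fh:
        return sum(1 for _ in csv.reader(fh))
```

A call such as `with temp_csv(data) as path: result, fit_time = timed(count_rows, path)` keeps the file export outside the measured interval, which mirrors how fit_time is computed in the listing above. time.perf_counter is used here instead of time.time because it is intended for measuring short durations; that is a design choice of this sketch, not of the original code.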
