Another technique used in data mining is clustering. SciPy provides two modules for it, each addressing a different clustering method:
scipy.cluster.vq for k-means and
scipy.cluster.hierarchy for hierarchical clustering.
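As a brief illustration of the second module, the following is a minimal sketch of hierarchical clustering on made-up data. The array points and the choice of single linkage are assumptions for the example only; linkage and fcluster are functions in scipy.cluster.hierarchy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: six points in the plane forming two tight groups.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Build the merge tree; each row of Z records one merge of two clusters.
Z = linkage(points, method='single')

# Cut the tree so that at most two flat clusters remain.
flat = fcluster(Z, t=2, criterion='maxclust')
```

Here flat holds one integer label per point, so points that share a label belong to the same cluster.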
We have two routines to divide data into clusters using the k-means technique: kmeans and kmeans2. They correspond to two different implementations.
The former has a very simple syntax:
kmeans(obs, k_or_guess, iter=20, thresh=1e-05)
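To make the call concrete, here is a minimal sketch that runs both routines on a small made-up data set. The array obs and the choice of two clusters are assumptions for illustration. The option minit='points' asks kmeans2 to pick its starting centroids from the data.

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2

# Hypothetical data: 6 points in the plane (a 6 x 2 array), in two tight groups.
obs = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# kmeans returns the centroids (the "codebook") and the mean distance
# between each observation and its nearest centroid (the distortion).
codebook, distortion = kmeans(obs, 2, iter=20, thresh=1e-05)

# kmeans2 returns the centroids together with a cluster label per point.
centroids, labels = kmeans2(obs, 2, minit='points')

print(codebook)
print(labels)
```

The order of the centroids and the numbering of the labels depend on the random initialization, so they may differ between runs even when the grouping is the same.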
The obs parameter is an ndarray with the data we wish to cluster. If the dimensions of the array are m x n, the algorithm interprets this data as m points in n-dimensional Euclidean space. If we know the number of clusters ...