As a recap, here's what we have done so far: we processed each tweet from the home timeline into a slice of float64. Each slice represents that tweet's coordinates in the higher-dimensional space. Now, all we need to do is the following:
- Create a clusterer.
- Create a [][]float64 representing all the tweets from the timeline.
- Train the clusterer.
- Predict which tweet belongs in which cluster.
It can be done as follows:
func main() {
	tweets := mock()
	p := newProcessor()
	p.process(tweets)

	// create a clusterer
	c, err := clusters.KMeans(10000, 25, clusters.EuclideanDistance)
	dieIfErr(err)

	data := asMatrix(tweets)
	dieIfErr(c.Learn(data))

	clusters := c.Guesses()
	for i, clust := range clusters {
		fmt.Printf("%d: %q\n", clust, ...
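To make the train and predict steps less of a black box, here is a minimal standard-library sketch of K-means using Lloyd's algorithm with Euclidean distance. It is an illustration only, not the clusters library's implementation: the names euclidean, nearest, and kmeans are hypothetical, the toy data stands in for the tweet matrix, and seeding the centroids with the first k rows is a simplification chosen to keep the result deterministic.

```go
package main

import (
	"fmt"
	"math"
)

// euclidean returns the straight-line distance between two equal-length vectors.
func euclidean(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// nearest returns the index of the centroid closest to p.
func nearest(p []float64, centroids [][]float64) int {
	best, bestDist := 0, math.Inf(1)
	for i, c := range centroids {
		if d := euclidean(p, c); d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

// kmeans runs at most maxIter rounds of Lloyd's algorithm and returns one
// cluster index per row of data. It assumes 0 < k <= len(data).
// Assumption: centroids are seeded with the first k rows (not random).
func kmeans(data [][]float64, k, maxIter int) []int {
	centroids := make([][]float64, k)
	for i := range centroids {
		centroids[i] = append([]float64(nil), data[i]...)
	}
	guesses := make([]int, len(data))
	for iter := 0; iter < maxIter; iter++ {
		// Assignment step: attach every point to its nearest centroid.
		changed := false
		for i, p := range data {
			if g := nearest(p, centroids); g != guesses[i] {
				guesses[i], changed = g, true
			}
		}
		if !changed && iter > 0 {
			break // assignments are stable, so the centroids will not move
		}
		// Update step: move each centroid to the mean of its points.
		sums := make([][]float64, k)
		counts := make([]int, k)
		for c := range sums {
			sums[c] = make([]float64, len(data[0]))
		}
		for i, p := range data {
			g := guesses[i]
			counts[g]++
			for j, v := range p {
				sums[g][j] += v
			}
		}
		for c := range centroids {
			if counts[c] == 0 {
				continue // leave the centroid of an empty cluster where it is
			}
			for j := range centroids[c] {
				centroids[c][j] = sums[c][j] / float64(counts[c])
			}
		}
	}
	return guesses
}

func main() {
	// Toy stand-in for the [][]float64 built from the tweets.
	data := [][]float64{{0, 0}, {10, 10}, {0, 1}, {10, 11}, {1, 0}}
	for i, g := range kmeans(data, 2, 25) {
		fmt.Printf("%d: %v\n", g, data[i])
	}
}
```

With this toy data, the three points near the origin end up sharing one cluster index and the two points near (10, 10) share the other, which is the same kind of per-row answer that the guesses in the listing above provide for each tweet.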