As a last bit of transformation, we would need to standardize our input data. This allow us to compare models to see if one model is better than another. To do so, I wrote two different scaling algorithms:
func scale(a [][]float64, j int) { l, m, h := iqr(a, 0.25, 0.75, j) s := h - l if s == 0 { s = 1 } for _, row := range a { row[j] = (row[j] - m) / s }}func scaleStd(a [][]float64, j int) { var mean, variance, n float64 for _, row := range a { mean += row[j] n++ } mean /= n for _, row := range a { variance += (row[j] - mean) * (row[j] - mean) } variance /= (n-1) for _, row := range a { row[j] = (row[j] - mean) / variance }}
If you come from the Python world of data science, the first scale function is essentially what scikits-learn's ...