Clustering data points into groups requires a measure of the distance between any two data points. A common measure is the Euclidean distance, which is simply the straight-line distance between two points in k-dimensional space, where k is the number of independent variables, or features. Formally, the Euclidean distance between two points, p and q, given k independent variables or dimensions, is defined as follows:

$$d(p, q) = \sqrt{\sum_{i=1}^{k} (p_i - q_i)^2}$$
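As a minimal sketch of this calculation in Python, the Euclidean distance can be computed by squaring each coordinate difference, summing the squares, and taking the square root. The function name `euclidean_distance` and the sample points are illustrative assumptions, not taken from the text.

```python
import math


def euclidean_distance(p, q):
    """Return the straight-line distance between points p and q.

    p and q are sequences of k numeric coordinates, one per feature.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of dimensions")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Two hypothetical points described by k = 2 features.
p = (1.0, 2.0)
q = (4.0, 6.0)
print(euclidean_distance(p, q))  # 5.0
```

The coordinate differences here are 3 and 4, so the distance is the square root of 9 + 16, which is 5. In practice, features measured on very different scales are often standardized first so that no single feature dominates the sum.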
Other common measures of distance include the Manhattan distance, which sums the absolute values of the coordinate differences instead of their squares ($\sum_{i=1}^{k} |p_i - q_i|$), and the maximum coordinate ...