
R Machine Learning Essentials by Michele Usuelli


Optimizing the k-nearest neighbor algorithm

We built our KNN model using 37 features that differ in their relevance to the language. Given a new flag, its neighbors are the flags that share many attributes with it, regardless of how relevant those attributes are. A flag that shares many attributes that are irrelevant to the language is erroneously included in the neighborhood, while a flag that shares only a few highly relevant attributes is left out.
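A tiny sketch makes this concrete. The flags below are hypothetical binary attribute vectors (not rows from the chapter's flags dataset): the first two positions are the relevant attributes, the rest are irrelevant. Under a plain Hamming distance, the flag from another language ends up closer than the flag that agrees on both relevant attributes:

```r
# Hypothetical flags: positions 1-2 are relevant to the language,
# positions 3-6 are irrelevant noise.
new_flag   <- c(1, 1,  0, 0, 0, 0)
same_lang  <- c(1, 1,  1, 1, 1, 1)  # agrees on both relevant attributes
other_lang <- c(0, 0,  0, 0, 0, 1)  # agrees only on irrelevant attributes

# Hamming distance: number of attributes on which two flags disagree
hamming <- function(a, b) sum(a != b)

hamming(new_flag, same_lang)   # 4 - penalized by the irrelevant attributes
hamming(new_flag, other_lang)  # 3 - closer, despite the wrong language
```

Because the four irrelevant attributes outvote the two relevant ones, the wrong-language flag wins the neighborhood.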

KNN performs worse in the presence of irrelevant attributes. This problem is known as the curse of dimensionality, and it affects many machine learning algorithms. One solution is to rank the features by their relevance and select only the most relevant ones. Another ...
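The ranking-and-selection idea can be sketched as follows. The data here is simulated (not the chapter's flags dataset), and the relevance score, the absolute difference between the class-conditional means of each binary feature, is an illustrative stand-in for measures such as chi-squared or information gain. The `knn` function comes from the `class` package, which ships with R:

```r
library(class)  # provides knn()
set.seed(1)

# Simulated data: 2 relevant binary features plus 8 pure-noise features.
n <- 100
language <- factor(rep(c("A", "B"), each = n / 2))
flips    <- matrix(rbinom(n * 2, 1, 0.1), n, 2)       # 10% label noise
relevant <- (matrix(language == "A", n, 2) + flips) %% 2
noise    <- matrix(rbinom(n * 8, 1, 0.5), n, 8)
x        <- cbind(relevant, noise)

# Rank features by relevance: |mean in class A - mean in class B|.
score <- apply(x, 2, function(col) {
  abs(mean(col[language == "A"]) - mean(col[language == "B"]))
})
top <- order(score, decreasing = TRUE)[1:2]   # keep the most relevant two

# Compare KNN accuracy with all features vs. the selected ones.
train    <- sample(n, n / 2)
pred_all <- knn(x[train, ], x[-train, ], language[train], k = 5)
pred_sel <- knn(x[train, top], x[-train, top], language[train], k = 5)
mean(pred_all == language[-train])  # accuracy with all 10 features
mean(pred_sel == language[-train])  # accuracy with the 2 relevant features
```

Dropping the noise features typically improves accuracy here, because the distance is no longer diluted by irrelevant dimensions.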
