🏘️

K-Nearest Neighbors

About


(This page is based primarily on material from An Introduction to Statistical Learning and the Wikipedia page)
 

Basics

KNN classification works as follows: for a query point, find the k training observations closest to it under some distance metric (typically Euclidean), then assign the class held by the majority of those k neighbours.
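
As a concrete illustration, here is a minimal NumPy sketch of that majority-vote rule (the toy data and function name are made up for the example):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_query, k=5):
    # Euclidean distance from the query point to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k nearest neighbours
    nearest = np.argsort(dists)[:k]
    # Majority vote among their labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy usage
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([0.95, 1.0]), k=3))  # -> 1
```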

Pre-processing

  • Normalise the data!
  • The curse of dimensionality means KNN works less well in high-dimensional spaces, as distances between points become increasingly similar.
    • Common to employ feature extraction (e.g. PCA) prior to KNN (see the sketch after this list)
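
A minimal sketch of that pre-processing chain, assuming scikit-learn and some feature matrix X with labels y (the number of components and neighbours are placeholder values):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

model = make_pipeline(
    StandardScaler(),                     # normalise each feature to zero mean, unit variance
    PCA(n_components=10),                 # reduce dimensionality before measuring distances
    KNeighborsClassifier(n_neighbors=5),  # KNN on the reduced, scaled features
)
# model.fit(X, y); model.predict(X_new)
```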

Notes

  • Common to weight each neighbour's contribution, e.g. by 1/d where d is its distance to the query point
  • For discrete variables, the distance can be e.g. Hamming distance
  • Can be extended to regression by taking the (weighted) average of the neighbours' output values (see the sketch after this list)
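
For the regression variant, a minimal NumPy sketch using the 1/d weighting mentioned above (toy data; the small eps is an assumption to avoid division by zero when a neighbour coincides with the query point):

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=5, eps=1e-12):
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + eps)  # inverse-distance weighting
    # Weighted average of the k neighbours' outputs
    return np.sum(weights * y_train[nearest]) / np.sum(weights)

# Toy usage
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
print(knn_regress(X_train, y_train, np.array([1.5]), k=2))  # -> 2.5
```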