K-Nearest Neighbors


(This page is based primarily on material from An Introduction to Statistical Learning and the Wikipedia page)


KNN classification works as follows: to label a query point, find the k training points nearest to it under a chosen distance metric (typically Euclidean) and take a majority vote over their labels. Practical notes:


  • Normalise the data first, so that no single feature dominates the distance computation.
  • The curse of dimensionality means KNN works less well in high-dimensional spaces, where distances between points become increasingly similar.
    • It is common to apply feature extraction or dimensionality reduction (e.g. PCA) prior to KNN
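The normalisation step above can be sketched in pure Python. This is a minimal min-max scaler (one of several common choices; z-score standardisation is an alternative); the function name `minmax_normalise` is illustrative, not from the source.

```python
def minmax_normalise(rows):
    # rows: list of equal-length numeric feature vectors.
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # Scale each feature to [0, 1]; constant columns map to 0.
    return [
        tuple((v - l) / (h - l) if h > l else 0.0
              for v, l, h in zip(row, lo, hi))
        for row in rows
    ]

# Features on very different scales become comparable:
scaled = minmax_normalise([(0, 10), (5, 20), (10, 30)])
```

Without this step, a feature measured in thousands would swamp one measured in fractions when computing Euclidean distances.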


  • It is common to weight each neighbour's contribution, e.g. by inverse distance 1/d
  • For discrete variables, the distance can be e.g. the Hamming distance
  • KNN extends to regression by taking the (weighted) average of the neighbours' output values
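The voting, inverse-distance weighting, and regression extension above can be sketched together. This is an illustrative pure-Python implementation, not from the source; the 1e-9 in the weight is an assumed guard against division by zero when a query coincides with a training point.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3, mode="classify"):
    # train: list of (features, output) pairs; features are numeric tuples.
    # Find the k nearest training points under Euclidean distance.
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    # Inverse-distance weights (1/d); the epsilon avoids division by zero,
    # so an exact match gets an overwhelming (but finite) weight.
    weights = [(1.0 / (d + 1e-9), y) for d, y in nearest]
    if mode == "classify":
        # Weighted majority vote over neighbour labels.
        votes = Counter()
        for w, y in weights:
            votes[y] += w
        return votes.most_common(1)[0][0]
    # Regression: weighted average of neighbour outputs.
    total = sum(w for w, _ in weights)
    return sum(w * y for w, y in weights) / total

# Classification: the query sits near the 'a' cluster.
train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
label = knn_predict(train, (0.2, 0.2), k=3)

# Regression: weighted average of neighbouring outputs.
train_r = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0)]
value = knn_predict(train_r, (1,), k=2, mode="regress")
```

Note that with unweighted voting and k=3 the 'b' neighbour would still be outvoted here; the weighting mainly matters near class boundaries, where closer neighbours should count for more.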