About
(This page is based primarily on material from An Introduction to Statistical Learning and the Wikipedia page)
Basics
KNN classification works as follows: to classify a query point, compute its distance (e.g. Euclidean) to every training point, take the k closest training points, and assign the class that occurs most often among those neighbours (majority vote).
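A minimal sketch of majority-vote KNN classification, assuming NumPy is available; the function name `knn_classify` and the toy data are illustrative choices, not from the source material:

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_query, k=3):
    """Classify a single query point by majority vote among its k nearest neighbours."""
    # Euclidean distance from the query to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    # Majority vote over the neighbours' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy example
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["a", "a", "b", "b"])
print(knn_classify(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> "a"
```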
Pre-processing
- Normalise the data!
- The curse of dimensionality means KNN works less well in high-dimensional spaces, as distances between points become increasingly similar.
- Common to employ feature extraction (e.g. PCA) prior to KNN (see the sketch after this list)
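A hedged sketch of a typical pre-processing pipeline (standardise, reduce dimensionality, then fit KNN), assuming scikit-learn is available; the synthetic data and the particular parameter values (n_components=2, n_neighbors=5) are only illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data purely for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(
    StandardScaler(),                     # normalise: zero mean, unit variance per feature
    PCA(n_components=2),                  # feature extraction to mitigate high dimensionality
    KNeighborsClassifier(n_neighbors=5),  # KNN on the scaled, reduced features
)
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))  # accuracy on held-out data
```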
Notes
- Common to weight each neighbour's contribution, e.g. by 1/d (inverse distance), so closer neighbours count more
- For discrete variables, the distance can be e.g. the Hamming distance
- Can be extended to regression by taking the (weighted) average of the neighbours' target values (see the sketch below)
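A minimal sketch of distance-weighted KNN regression using 1/d weights, again assuming NumPy; `knn_regress`, the `eps` guard, and the toy data are illustrative assumptions:

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=3, eps=1e-9):
    """Predict a continuous target as the inverse-distance-weighted average
    of the k nearest neighbours' target values (weight 1/d)."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + eps)  # 1/d weighting; eps avoids division by zero
    return np.average(y_train[nearest], weights=weights)

# Toy example
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0])
print(knn_regress(X_train, y_train, np.array([1.4]), k=2))  # close to 1.4
```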