K-Nearest Neighbors


(This page is based primarily on material from An Introduction to Statistical Learning and the Wikipedia page)


KNN classification works as follows: to label a query point, find the k training points nearest to it under a chosen distance metric (typically Euclidean) and take a majority vote over their labels. Practical notes:


  • Normalise the data first, so that no single feature dominates the distance computation.
  • The curse of dimensionality means KNN works less well in high-dimensional spaces, where distances between points become increasingly similar.
    • It is common to apply feature extraction or dimensionality reduction (e.g. PCA) prior to KNN
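The normalisation step above can be sketched in pure Python. This is a minimal min-max scaler (one of several common choices; z-score standardisation is an alternative); the function name `minmax_normalise` is illustrative, not from the source.

```python
def minmax_normalise(rows):
    # rows: list of equal-length numeric feature vectors.
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # Scale each feature to [0, 1]; constant columns map to 0.
    return [
        tuple((v - l) / (h - l) if h > l else 0.0
              for v, l, h in zip(row, lo, hi))
        for row in rows
    ]

# Features on very different scales become comparable:
scaled = minmax_normalise([(0, 10), (5, 20), (10, 30)])
```

Without this step, a feature measured in thousands would swamp one measured in fractions when computing Euclidean distances.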


  • It is common to weight each neighbour's contribution, e.g. by inverse distance 1/d
  • For discrete variables, the distance can be e.g. the Hamming distance
  • KNN extends to regression by taking the (weighted) average of the neighbours' output values
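The voting, inverse-distance weighting, and regression extension above can be sketched together. This is an illustrative pure-Python implementation, not from the source; the 1e-9 in the weight is an assumed guard against division by zero when a query coincides with a training point.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3, mode="classify"):
    # train: list of (features, output) pairs; features are numeric tuples.
    # Find the k nearest training points under Euclidean distance.
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    # Inverse-distance weights (1/d); the epsilon avoids division by zero,
    # so an exact match gets an overwhelming (but finite) weight.
    weights = [(1.0 / (d + 1e-9), y) for d, y in nearest]
    if mode == "classify":
        # Weighted majority vote over neighbour labels.
        votes = Counter()
        for w, y in weights:
            votes[y] += w
        return votes.most_common(1)[0][0]
    # Regression: weighted average of neighbour outputs.
    total = sum(w for w, _ in weights)
    return sum(w * y for w, y in weights) / total

# Classification: the query sits near the 'a' cluster.
train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
label = knn_predict(train, (0.2, 0.2), k=3)

# Regression: weighted average of neighbouring outputs.
train_r = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0)]
value = knn_predict(train_r, (1,), k=2, mode="regress")
```

Note that with unweighted voting and k=3 the 'b' neighbour would still be outvoted here; the weighting mainly matters near class boundaries, where closer neighbours should count for more.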