MLCC - Laboratory 1 - Local methods


This lab is about local methods for binary classification on synthetic data. The goal of the lab is to get familiar with the kNN algorithm and to get a practical grasp of what we have discussed in class. Follow the instructions below. Think hard before you call the instructors!

Download:

1. Warm up - data generation

Open the MATLAB file MixGauss.m

[X1, Y1] = MixGauss([[0;0],[1;1]],[0.5,0.25],50);
figure(1); scatter(X1(:,1),X1(:,2),50,Y1,'filled'); %type "help scatter" to see what the parameters mean
title('dataset 1');
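To get a feel for what the call above produces, here is a minimal sketch of a two-class Gaussian mixture generator. It is not the contents of MixGauss.m: it assumes that each column of the first argument is a class centre, that each entry of the second argument is an isotropic standard deviation, that the third argument is the number of points per class, and that labels are 0, 1, ... in class order.

```matlab
% Hypothetical re-creation of the kind of data MixGauss might return
% (assumed behaviour, not the actual MixGauss.m).
means  = [[0;0],[1;1]];   % one column per class centre
sigmas = [0.5, 0.25];     % one standard deviation per class (assumed isotropic)
n      = 50;              % points per class (assumed meaning of the third argument)

[d, p] = size(means);     % d = input dimension, p = number of classes
X = zeros(n*p, d);
Y = zeros(n*p, 1);
for c = 1:p
    rows = (c-1)*n + (1:n);                      % rows belonging to class c
    X(rows, :) = sigmas(c) * randn(n, d) + repmat(means(:, c)', n, 1);
    Y(rows)    = c - 1;                          % assumed labelling: 0, 1, ...
end
```

Under these assumptions X is a 100-by-2 matrix of points and Y a 100-by-1 vector of labels, which is the shape the scatter call above expects.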

2. Core - kNN classifier

The k-Nearest Neighbors algorithm (kNN) assigns to a test point the most frequent label of its k closest examples in the training set.
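The rule above can be sketched in a few lines. This is only an illustration on a tiny hand-made dataset, not the solution expected for the lab: the names Xtr and Ytr and the data values are made up, distances are assumed to be Euclidean, and ties are resolved by whatever mode returns.

```matlab
% Tiny hand-made training and test sets (illustrative values only)
Xtr = [0 0; 0 1; 1 0; 5 5; 5 6; 6 5];   % training inputs, one point per row
Ytr = [-1; -1; -1; 1; 1; 1];            % training labels
Xts = [0.5 0.5; 5.5 5.5];               % test inputs
k   = 3;                                % number of neighbours

Nts   = size(Xts, 1);
Ypred = zeros(Nts, 1);
for i = 1:Nts
    % squared Euclidean distance from test point i to every training point
    diffs = Xtr - repmat(Xts(i, :), size(Xtr, 1), 1);
    dist2 = sum(diffs.^2, 2);
    [~, idx] = sort(dist2, 'ascend');    % closest training points first
    Ypred(i) = mode(Ytr(idx(1:k)));      % most frequent label among the k nearest
end
```

With binary labels an odd k rules out ties in the vote; for even k, mode returns the smallest of the tied labels, which is an arbitrary choice.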

figure;
scatter(Xts(:,1),Xts(:,2),50,Yts,'filled'); %plot test points (filled circles) associating a different color to each "true" label
hold on
scatter(Xts(:,1),Xts(:,2),200,Ypred,'x'); % plot test points (crosses) associating a different color to each estimated label
MATLAB line for the classification error: sum(Ypred~=Yts)./Nts % Nts: number of test points
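As a quick sanity check of the error formula, here is a small worked example with made-up label vectors (the values are illustrative and do not come from the lab data).

```matlab
Yts   = [1; -1;  1;  1];             % "true" test labels (made up)
Ypred = [1;  1;  1; -1];             % estimated labels (made up)
Nts   = numel(Yts);                  % number of test points
err   = sum(Ypred ~= Yts) ./ Nts;    % fraction of misclassified points
```

Two of the four predictions differ from the true labels, so err is 0.5.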

3. Parameter selection - What is a good value for k?

So far we considered an arbitrary k...
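One common way to choose k is hold-out validation: set aside part of the training data, measure the error on it for several values of k, and keep the value with the lowest error. The sketch below illustrates the idea and is not necessarily the procedure intended for this lab. The data, the 30% hold-out fraction, the candidate values in ks, and all variable names are assumptions made for the example.

```matlab
rng(0);                                      % fix the seed so the run is repeatable
n = 100;
X = [0.5*randn(n,2); 0.5*randn(n,2) + 2];    % two Gaussian blobs (illustrative data)
Y = [-ones(n,1); ones(n,1)];                 % labels in {-1,+1}

perm = randperm(2*n);                        % random split into training and validation
nval = round(0.3 * 2*n);                     % hold out 30% (arbitrary choice)
ival = perm(1:nval);
itr  = perm(nval+1:end);
Xtr  = X(itr,:);   Ytr  = Y(itr);
Xval = X(ival,:);  Yval = Y(ival);

ks     = [1 3 5 7 9 11];                     % odd values avoid ties in binary voting
valErr = zeros(size(ks));
for j = 1:numel(ks)
    Ypred = zeros(nval, 1);
    for i = 1:nval
        d2 = sum((Xtr - repmat(Xval(i,:), size(Xtr,1), 1)).^2, 2);
        [~, idx] = sort(d2);                 % nearest training points first
        Ypred(i) = sign(sum(Ytr(idx(1:ks(j)))));   % majority vote for +/-1 labels
    end
    valErr(j) = sum(Ypred ~= Yval) / nval;   % validation error for this k
end
[~, best] = min(valErr);
kBest = ks(best);                            % first k with the lowest validation error
```

A single split gives a noisy estimate; repeating it over several random splits and averaging valErr is a natural refinement.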

4. If you have time - More experiments