From: Dário Abdulrehman on 7 Jun 2010 08:06

This is my first attempt at writing Mathematica code, but I am getting strange results, probably because of a bug I cannot detect.

The code creates a test set of size 1000 with 2 classes drawn from bivariate Gaussian distributions, then creates 6 training sets with 2 classes and sizes 10^i, i = 1..6. I then run the nearest neighbor algorithm on each training set against the test set and compute the error rate.

However, as the table at the end shows, the error rates don't make much sense: I might as well flip a coin instead of running the algorithm. Unfortunately I cannot spot the bug in the code.

Thanks.

****************************************************
(* Code *)

Needs["MultivariateStatistics`"];
m = 6;
testSize = 1000;
MN1 = MultinormalDistribution[{0.5, 0.5}, {{1, 0}, {0, 1}}];
MN2 = MultinormalDistribution[{-0.5, -0.5}, {{1, 0}, {0, 1}}];
RandomVector[n_] := Join[Array[RandomReal[MN1] &, n/2], Array[RandomReal[MN2] &, n/2]];
testSet = RandomVector[testSize];
trainingSets = Map[Function[x, RandomVector[x]], NestList[10 # &, 10, m - 1]];
classOf[i_] = If[i <= (testSize/2), 1, 2];
NN[trainingSet_] := Module[{nnFunc = Nearest[trainingSet -> Automatic]},
  N[Fold[Plus, 0, MapIndexed[If[classOf[First[nnFunc[#1]]] != classOf[First[#2]], 1, 0] &, testSet]]/testSize]]
Grid[{Prepend[NestList[10 # &, 10, m - 1], "m"], Prepend[Map[Function[trainingSet, NN[trainingSet]], trainingSets], "error rate"]}, Frame -> All]

(* Output *)

m            10     100    1000    10000   100000   1000000
error rate   0.5    0.5    0.322   0.484   0.499    0.501
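
For comparison, here is a minimal sketch of the error-rate step described above. It is not the posted code. It assumes, as in the post, that each sample list holds class 1 in its first half and class 2 in its second half. It also relies on Nearest[data -> Automatic] returning positions in the list it was built from (here, the training set), so the class of a neighbor is looked up against the training set's length rather than testSize. The names labelOf and errorRate are hypothetical helpers introduced only for this illustration.

```mathematica
(* Hypothetical helper: in a list of n points whose first half is class 1
   and second half is class 2, return the class of the point at position i. *)
labelOf[i_Integer, n_Integer] := If[i <= n/2, 1, 2];

(* Hypothetical helper: 1-nearest-neighbor error rate of testSet against
   trainingSet, both laid out as described above. *)
errorRate[trainingSet_List, testSet_List] :=
  Module[{nf, nTrain, nTest, predicted, actual},
    nf = Nearest[trainingSet -> Automatic]; (* yields positions in trainingSet *)
    nTrain = Length[trainingSet];
    nTest = Length[testSet];
    (* predicted class: class of the nearest training point *)
    predicted = labelOf[First[nf[#]], nTrain] & /@ testSet;
    (* true class: implied by the position in the test set *)
    actual = labelOf[#, nTest] & /@ Range[nTest];
    (* fraction of test points whose predicted class differs *)
    N[Total[Unitize[predicted - actual]]/nTest]
  ]
```

With the definitions from the post, this could be applied to every training set as errorRate[#, testSet] & /@ trainingSets. Passing both lists explicitly keeps the two lengths separate, so the half-way threshold always comes from the list the index refers to.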