Genetic algorithm with bitstring [Matlab]

From: Lennart van Luijk on 7 May 2010 10:23

I have a problem with implementing a genetic algorithm (first time for me in Matlab & the optimtool).

I have a matrix of 756 columns and 25000 rows, where the columns represent locations, the rows represent time, and the values are (hourly summed) measured quantities. I sum across the columns to get 25000 values over time (34 months of hourly values), which I will call the 'global balance'. Over time I see pretty steady behavior (balance around 0), but at certain points in time there are some "anomalies". I want to find out which of the locations/columns cause these anomalies (I suspect that out of the 756 locations only about 10-20 are actually causing this). I want to use a genetic algorithm for this.

I've created a bitstring creation function that generates a matrix indicating which columns to select, as follows:

function Population = genalgCreationFnc_mp(GenomeLength,FitnessFcn,options)
    disp('Creating population..');

    %Maximum no. of MPs that can cause the anomaly together! (10)
    maxIndivs = 10;
    GenomeLength = 756;

    totalPopulation = sum(options.PopulationSize);
    initPopProvided = size(options.InitialPopulation,1);
    individualsToCreate = totalPopulation - initPopProvided;

    Population = zeros(totalPopulation,GenomeLength);

    if initPopProvided > 0
        Population(1:initPopProvided,:) = options.InitialPopulation;
    end

    for i=1:individualsToCreate
        aant = randi(maxIndivs);
        locs = randi(GenomeLength,aant,1);
        while length(unique(locs)) < aant
            disp('Replace equal values, generate again...')
            locs = randi(GenomeLength,aant,1);
        end
        Population(initPopProvided+i,:) = zeros(GenomeLength,1);
        Population(initPopProvided+i,locs) = 1;
    end

============================
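
As a side note on the loop above: the retry loop that regenerates locs until all values are distinct can be avoided by drawing from a random permutation. The following is only a sketch for a single individual; the name individual is a hypothetical stand-in, while GenomeLength, maxIndivs, aant and locs are taken from the post.

```matlab
GenomeLength = 756;   % number of locations, as in the post
maxIndivs    = 10;    % upper bound on selected locations, as in the post

aant = randi(maxIndivs);           % how many locations to switch on
perm = randperm(GenomeLength);     % random ordering of 1..GenomeLength, no repeats
locs = perm(1:aant);               % the first aant entries are distinct by construction

individual = zeros(1, GenomeLength);   % hypothetical name for one population row
individual(locs) = 1;                  % mark the selected locations
```

Because a permutation never repeats a value, no rejection step is needed and exactly aant bits are set.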

Then I evaluate the fitness as follows (only the essential part of the code):

function scores = gbfitness(population)

    disp('Running fitness test...');

    scores = zeros(size(population,1),1);

    for i=1:size(population,1)
        locs = find(population(i,:)==1);
        val = sum(data(:,locs),2); %#ok<FNDSB>
        [corrsmat,pvals] = corrcoef(val,gb);
        pval = pvals(1,2);
        scores(i) = abs(corrsmat(1,2));
    end

end

==============================

After one iteration, I get a 'subscripted assignment dimension mismatch'.

When I use a population of 99, the output of the creation function should be a 99x756 matrix of [1;0] right?
And that's the input of the fitness function, which should output an array of 99x1 fitness values right?

I don't know how to continue at the moment.. Who can help?
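
On the shape question above: by default ga calls the fitness function with one individual at a time (a 1-by-GenomeLength row) and expects a scalar back. It passes the whole population and expects a column of scores only when the 'Vectorized' option is set to 'on'. The sketch below uses small made-up data to show both shapes; data and gb here are toy stand-ins, and pick12, scoreOne and scoreAll are hypothetical helper names, not part of the original code.

```matlab
rng(0);                               % reproducible toy data
data = randn(100, 8);                 % toy stand-in: 100 hours, 8 locations
gb   = sum(data, 2);                  % toy global balance

pick12 = @(m) m(1, 2);                % off-diagonal entry of a 2-by-2 matrix

% Score one individual (a row of 0/1); data * x(:) sums the selected columns.
scoreOne = @(x) abs(pick12(corrcoef(data * x(:), gb)));

% Score a whole population row by row, returning an N-by-1 column.
scoreAll = @(pop) arrayfun(@(i) scoreOne(pop(i, :)), (1:size(pop, 1))');

pop = double(rand(5, 8) > 0.5);       % 5 random individuals
s1  = scoreOne(pop(1, :));            % scalar, the non-vectorized case
sN  = scoreAll(pop);                  % 5-by-1 column, the vectorized case
```

A fitness function that loops over the rows of its input, like gbfitness, handles both cases, since a single individual is just a population with one row.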

From: Daniel Armyr on 7 May 2010 11:18

> After one iteration, I get a 'subscripted assignment dimension mismatch'.
>
> When I use a population of 99, the output of the creation function should be a 99x756 matrix of [1;0] right?
> And that's the input of the fitness function, which should output an array of 99x1 fitness values right?
>
> I don't know how to continue at the moment.. Who can help?

Hi. I am not going to try to run your code, and I am not going to check it for errors, because I believe that would take me a lot more time than it will take you, since you can actually run the code.

So, I am going to teach you how to find out what the problem is.

1) Run the program from the command window.
2) When MATLAB reports the error (dimension mismatch), it gives you a line number in link form that you can click on. Click on it to see which line it was.
3) Because it was a subscripted assignment dimension mismatch error, the line will look like this (it may be much more complicated, but this is the essence of the line):
a(:) = b
Your a(:) may also be a(i,:) or a(i) or some other form, but it will be a part of a matrix.
Your b can be any matrix.
4) Split the line into two. If you had the line above, you split it into the following two lines:
c = b;
a(:) = c;
Here, I am assuming that b is actually a mathematical formula or a function call, and not just a simple matrix. If it is just a simple matrix, you can skip this step.
5) Place a breakpoint on the line a(:) = c and run the code from the command window.
6) Type disp(size(c)) and then disp(size(a(:))) into the command window (use whatever indexing your own line has on the left-hand side). The two answers will not be the same, but for the assignment to work they have to be. (The exception is if c is a scalar; then it works too.) Figure out why they are not the same and change your code until they are.
7) Celebrate that you have fixed your bug.
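
The steps above can be tried on a toy case. This is only an illustrative sketch: the matrices a and b are made up, and the indexing a(2,:) stands in for whatever appears on the left-hand side of the failing line.

```matlab
a = zeros(3, 4);          % target matrix
b = ones(1, 5);           % stand-in for the right-hand side expression

c = b;                    % step 4: evaluate the right-hand side on its own
disp(size(c))             % size of what is being assigned: 1-by-5
disp(size(a(2, :)))       % size of the slot being assigned to: 1-by-4

try
    a(2, :) = c;          % sizes differ, so this raises the mismatch error
catch err
    disp(err.message)     % exact wording depends on the MATLAB release
end
```

Comparing the two sizes shows at a glance which side of the assignment has the unexpected shape.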

OK, that was probably the most in-depth coaching I have done so far. Good luck.
--DA

From: Lennart van Luijk on 10 May 2010 03:03

Hey, thanks for your answer! But this is not a problem. When I run my separate functions from the commandline, everything works..
I know about matrices and this particular error. What I don't know, is why the genetic algorithm is generating this error while the two separate functions do work from the commandline.
My creation function generates a 200x756 matrix of zeros and ones for a population size of 200.
The fitness function then generates 200 fitness scores in the range [0, 1].

So, the functions seem to work properly but they don't work within the Genetic Algorithm! I don't understand why..

From: Daniel Armyr on 10 May 2010 03:38

> Hey, thanks for your answer! But this is not a problem. When I run my separate functions from the commandline, everything works..

That may be true, but when you run things from the command line, you are changing the code. Not by very much, but by a little. The only way to be sure is to run the code exactly as it runs inside the genetic algorithm and use breakpoints. Sometimes even that is too much of a change, but not in MATLAB code as straightforward as this.

Sincerely
Daniel Armyr

From: Lennart van Luijk on 10 May 2010 07:13

Hey, I've tried this but the error is only displayed in the 'optimtool' window where the Genetic Algorithm runs.

I display diagnostic progress messages and both functions run fine the 1st iteration. I even see the first generation's Best and Mean fitness value.

My guess is the problem resides in the functions that follow. Perhaps the Fitness scaling / Selection / Mutation functions?

How do I know in which function the problem lies? Because when I use 'rank' for fitness scaling, I don't know which m-file is used for this.
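
One way to locate the failing function is to run ga from the command window instead of optimtool, with dbstop if error switched on, so that MATLAB pauses at the exact line that fails, even when it is inside a toolbox m-file. The 'rank' scaling option corresponds to the toolbox function fitscalingrank, and which shows where that m-file lives. The sketch below assumes the Global Optimization Toolbox is installed and uses a made-up toy problem; data, gb, pick12, fitness and the option values are hypothetical stand-ins, not the setup from the post.

```matlab
rng(1);                                  % reproducible toy data
data = randn(200, 20);                   % toy stand-in: 200 hours, 20 locations
gb   = sum(data(:, [3 7]), 2);           % toy balance driven by columns 3 and 7

pick12 = @(m) m(1, 2);                   % off-diagonal entry of a 2-by-2 matrix
% ga minimizes, so the absolute correlation is negated.
% max(0, NaN) returns 0, so an all-zero individual scores 0 instead of NaN.
fitness = @(x) -max(0, abs(pick12(corrcoef(data * x(:), gb))));

opts = gaoptimset('PopulationType', 'bitstring', ...
                  'PopulationSize', 30, ...
                  'Generations', 20, ...
                  'MutationFcn', @mutationuniform, ...
                  'Display', 'off');

which fitscalingrank                     % path of the m-file behind 'rank' scaling

dbstop if error                          % pause in the debugger where an error occurs
[xBest, fBest] = ga(fitness, 20, [], [], [], [], [], [], [], opts);
dbclear if error                         % switch the automatic breakpoint off again
```

If an error does occur, typing dbstack at the debugger prompt lists the chain of calls leading to it, and the size checks from the earlier reply can be run on the variables in that workspace.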
