From: Trish R on 19 May 2010 11:53 Hi George, Your post here was helpful to me, thank you. I have a question. You stated, "Fifthly, interface the PSO toolbox and ANN code by creating a new objective function for the toolbox". Can you please clarify the use of objective functions with ANN and PSO? Thank you. "George " <george(a)georgeevers.org> wrote in message <hp68bb$j0t$1(a)fred.mathworks.com>... > "Mohammed Ibrahim" <hammudy20(a)yahoo.com> wrote in message <hp5au2$3iu$1(a)fred.mathworks.com>... > > "Burak " <newsreader(a)mathworks.com> wrote in message <f9sagf$dfn$1(a)fred.mathworks.com>... > > > Hi there, > > > > > > I am a graduate student working on particle swarm > > > optimization. I want to learn more about ANN training with > > > PSO. Although there is a good PSO toolbox release, its source code for neural network training seems complicated to me. There are some articles on this issue, but it is not clear how they apply PSO to ANN training. > > > Thanks for your answers and help. > > > > > > Burak > > Burak, to train an ANN using PSO, firstly, identify a well-performing ANN for your application. Find characteristics that seem to work well for problems similar to yours: e.g. novel concepts, number of hidden layers, number of inputs, and types of inputs. Keep a detailed bibliography and save all relevant papers. Though you will train with PSO, keep notes on other good training algorithms against which to compare in order to fully demonstrate the validity of PSO as a training mechanism. > > Secondly, find a PSO type suitable to your application. For example, RegPSO [1: Chapters 4-6], [2] does a great job of escaping from premature convergence in order to find increasingly better solutions when there is time to regroup the swarm, which would seem to be the case for training ANNs in general.
> > Thirdly, locate a good PSO toolbox - preferably one that is already capable of implementing the strain of PSO you would like to use. Ideally, the toolbox would contain standard gbest and lbest PSOs as well as the more evolved PSO type found in step two. If the variation of PSO you would like to use is not available in a suitable toolbox, locate a powerful toolbox and contribute the code. The PSO toolbox doesn't need to have code for training ANNs, since you can locate solid code for implementing ANNs and simply interface the best of both worlds. > > Fourthly, locate a good ANN code to interface with the toolbox - preferably written in the same language. As long as you can implement the ANN with code alone (e.g. as with MATLAB's neural net toolbox) rather than necessarily depending on a GUI, the two can be interfaced. > > Fifthly, interface the PSO toolbox and ANN code by creating a new objective function for the toolbox. If you use the "PSO Research Toolbox," just create a new file called "benchmark_BurakANN.m" using the skeleton below to interface the two codes:
> function [f] = benchmark_BurakANN(x, np)
> global dim
> f = zeros(np, 1);
> for Internal_j = 1:np
>     % Replace ann_error (a hypothetical routine) with your own code that
>     % maps one position/row vector to the ANN error to be minimized.
>     f(Internal_j, 1) = ann_error(x(Internal_j, :));
> end
> > What makes sense to me is that each function value in column vector "f" would reflect the error (e.g. the difference or biased difference) between the ANN's prediction and the actually desired function value, since it is the error that you want to minimize. To be more in line with real-world applications, you could translate each error into financial cost and minimize that value. > > FYI, the problem dimensionality will be equal to the number of ANN parameters that you wish to optimize, so that each dimension represents one decision variable of the network to be optimized. For example, will you keep the number of hidden layers constant or include this as one dimension to be optimized?
Read up on the most recent ANN literature: I could be wrong, but it is my impression that while more complex ANNs (e.g. those with more hidden layers) might be capable of solving more complicated problems, they also tend to memorize the training data more quickly (i.e. overfit), which is a big problem since the goal is not memorization but prediction. I personally would leave the number of hidden layers constant at whatever seems to have worked best in the literature and possibly experiment with changing it at a later time. > > Happy Researching! > > Sincerely, > George I. Evers > > [1] http://www.georgeevers.org/thesis.pdf > [2] http://www.georgeevers.org/Evers_BenGhalia_IEEEConfSMC09.pdf
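[Editor's sketch] Step "Fifthly" above can be made concrete as follows, assuming MATLAB's neural net toolbox. The calls to setwb and sim are toolbox functions (older toolbox versions provide setx/getx in place of setwb/getwb); the globals net, Xtrain, and Ttrain are hypothetical names for a network built elsewhere (e.g. with newff) and its training data. This is one possible implementation, not the toolbox's own:

```matlab
% Sketch only: one way to write the objective function of step "Fifthly"
% for the PSO Research Toolbox. Assumes hypothetical globals: net (a
% network created elsewhere, e.g. with newff), Xtrain (training inputs),
% and Ttrain (training targets).
function [f] = benchmark_BurakANN(x, np)
global net Xtrain Ttrain
f = zeros(np, 1);
for Internal_j = 1:np
    % Load this particle's position vector into the network as its
    % weights and biases (older toolbox versions use setx instead).
    net = setwb(net, x(Internal_j, :)');
    % Function value to be minimized: mean squared prediction error.
    Y = sim(net, Xtrain);
    f(Internal_j, 1) = mean((Y(:) - Ttrain(:)).^2);
end
```

Under this weight encoding, the problem dimensionality discussed above is simply the total number of weights and biases, i.e. dim = length(getwb(net)).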
From: George on 19 May 2010 18:14 > Your post here was helpful to me, thank you. I have a question. You stated, > "Fifthly, interface the PSO toolbox and ANN code by creating a new objective function for the toolbox" > Can you please clarify the use of objective functions with ANN and PSO? Thank you. Tricia, You are welcome. Because this post is long, I created headings using capital letters. These don't mean I'm yelling at you though, lol. * CREATION OF NEW "FUNCTION" I used the PSO Research Toolbox to combat premature convergence across a suite of popular benchmark functions. To train an ANN instead, I suggest writing a "function" (i.e. in the programming sense of the word) as per step "Fifthly" in the 3 Apr, 2010 posting above in order to interface the toolbox with the ANN. This function will pass: (i) the matrix of position/row vectors from the toolbox to the ANN, and (ii) the column vector of function values to be minimized from the ANN to the toolbox. See the comments atop existing benchmark files for further clarification. Caution: Be sure to properly define the function value that you will be minimizing. For example, if you define error as f = predicted_f - historical_f, this produces negative values whenever the historical value is less than the ANN's prediction. The optimization process would then not minimize the error, since all negative error values would appear more attractive than the ideal error value of zero that occurs when predicted_f = historical_f. A simple solution would be to minimize the absolute value of the error, which creates a true minimum function value of zero where the actual and predicted values are identical. But this might not be practical for some applications where erring on one side could be more problematic than erring on the other side, in which case you might need to add a penalty based on the sign of the error.
For example, if I use an ANN to predict the 52-week maximum price of a stock in order to predict the ideal selling price for the year, an error of even one nickel could be devastating if it overestimates my selling price, since that could mean not being able to effect the trade (though realistically, I sold MDT above its "52-week maximum" since it is not a true mathematical maximum but appears to be nothing more than the maximum of the daily closing prices). Whereas selling for a nickel less than the predicted 52-week maximum price doesn't hurt me unless I'm trading penny stocks. So I suggest giving some consideration to the most intelligent way to define the error you would like to minimize, since that is crucial to giving your problem real-world meaning. * MODIFICATION OF BENCHMARKS.M Once you have created the new function, copy the code below into "Benchmarks.m" to complete the interface between the PSO Research Toolbox and MATLAB's neural net toolbox. Internally, programming variables refer to each function as a "benchmark": I have used the name "BurakANN" below for consistency with the pseudo code of the 3 Apr, 2010 post.
elseif benchmark_id == 11
    benchmark = 'BurakANN';
    range_ss0 = [];  % input a scalar or row vector here
    center_ss0 = []; % input a scalar or row vector here
To illustrate the use of scalar inputs, the Rastrigin benchmark searches [-5.12, 5.12] per dimension. Since the search space is the same per dimension, you can simply input scalar values. In this case, range_ss0 = 5.12 - (-5.12) = 2*5.12 = 10.24, and center_ss0 = 0. To illustrate the use of vector inputs, suppose you want to search between -5.12 and 5.12 on the first dimension (i.e. with a range of 10.24 centered at 0) and between 0 and 40 on the second dimension (i.e. with a range of 40 centered at 20). In this case, you could set range_ss0 = [10.24, 40] and center_ss0 = [0, 20].
This functionality was added to enable the PSO Research Toolbox to handle real-world application problems that might involve different feasible values per dimension. For example, if one dimension to be optimized stores the number of nodes in a particular ANN layer, multiple dimensions store the values of the weights, and perhaps another dimension stores the number of hidden layers, vector inputs allow you to define a reasonable center and range for each dimension. Velocities in this case are clamped to the percentage "vmax_perc" of the range on each dimension. * SELECTION OF NEW FUNCTION IN CONTROL_PANEL.M Be sure to set "benchmark_id = 11" near the end of "Control_Panel.m," which tells the PSO Research Toolbox to optimize the function to which id 11 is assigned in "Benchmarks.m" (i.e. your new function). Cordially, George I. Evers
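[Editor's sketch] The sign-based penalty from George's stock example above could be written as a small helper like this. The function name penalized_error and the weight of 10 are hypothetical choices, not part of any toolbox:

```matlab
% Sketch: an asymmetric error measure for the stock example above.
% Overestimating the 52-week maximum (predicting a selling price that
% may never be reached) is penalized more heavily than underestimating
% it. The weight 10 is hypothetical; tune it to your application.
function f = penalized_error(predicted_f, historical_f)
err = predicted_f - historical_f;
if err > 0
    f = 10*abs(err);  % overestimate: the trade might never execute
else
    f = abs(err);     % underestimate: only a small opportunity cost
end
```

The objective function would then sum or average penalized_error over the training samples instead of using a symmetric measure such as mean squared error.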
From: Tricia Rambharose on 24 May 2010 18:04 Hi George, Very, very informative. Thank you! However, the interface is still somewhat unclear to me. I agree that the optimization in this case is minimizing the error between the ANN output and the target value(s). Using older PSO toolboxes, the interface with Matlab is done simply when the network is created, by specifying e.g. 'trainpso' as the training function in the call to Matlab's newff function. If using your PSO Research Toolbox, what exact call has to be made from a main program to link PSO to an ANN programmed in Matlab? Your explanations before covered the creation of a new benchmark function, but no mention was made of modifications or parameter settings for the existing functions of Matlab's ANN toolbox to link PSO with ANN. Or maybe I did not fully understand your explanation. Can you further clarify? Thank you. "George " <george(a)georgeevers.org> wrote in message <ht1nrc$3cf$1(a)fred.mathworks.com>...
> [George's reply of 19 May 2010, quoted in full above, trimmed]