From: Luca Cerone on
Dear all,
I have developed a small ODE model (3 equations) depending on 6 parameters, (say a,b,c etc all of them have to be >0 and <1) and I would like to find parameters for the solution that best approximates some experimental (biological) data I've collected.
I know that it should be an easy and straightforward task using lsqcurvefit, fminsearch
or other function from the optimization toolbox, but I'd like to have some advices on the best strategies so that I can avoid common problems, like the one about tolerances, number of steps and so on.
First of all, which objective function would you use? As the data can go from low value to high value, I think that "mean of relative squared errors" would be more appropriate to use.
Also, I found a nice non-dimentional version of the ODE and the number of parameter reduce to 3. Is there any way I can use this kind of information?

Also data are obtained from experiment repeated at different time points.
Would you regard all the points in a unique curve?
Would you find parameters for each set of experiment and then average post/process it?

Can you please guide me a little bit in this process?
I've done it a few times with toys data and example, but now that I have to use real
data the process doesn't seem so easy :)
All kind of advices are really appreciated.
Cheers, -Luca
From: Arthur Goldsipe on
Hi Luca,

I don't think there's one definitive answer for how you should approach the problem. What exactly is your goal? What do you want to do with the model once you fit the parameters?

Even so, I think I can give you a little bit of advice. If you can reduce the number of parameters by non-dimensionalizing the model, then it probably means some of the parameters in the original problem are not identifiable. In that case, you should perform fitting using the non-dimensionalized model.

Another issue when fitting ODE models is that you must take care when estimating the gradient or Jacobian of your objective function. The default perturbations to parameter values are usually too small to provide good estimates of where the optimizer should step. In this case, you typically need to increase the size of the optimization parameter DiffMinChange. Even better, if your ODE is simple enough, you could probably provide analytical forms of the required derivatives (see the GradObj or Jacobian options).

Another common issue when fitting noisy biological data is that the presence of many locally optimum parameter values. This is especially important if you are not very sure of your initial parameter values. In this case, you can use one of the optimization functions from the Global Optimization toolbox to try to find the global optimum. Personally, I often start with the genetic algorithm function (ga) when I need to look at a large parameter space.

In terms of the objective function, it sounds like you're trying to figure out how to weight the various data points. Do you have some idea of the error for each data point? If you estimate that the error (or uncertainty) is roughly constant for all points, then your objective function can probably just be the mean square error. However, if the error is very different across data points (for example, proportional to the data values), then you need to use a weighted mean square error. Typically, the weight is 1/sigma, where sigma is the standard deviation of the error model. So, if your error model is proportional, you could write your objective function as follows:

objective = sum( ((y_predicted - y_measured)./y_measured).^2 );

Note that the above function is equivalent to:

objective = sum((1 - y_predicted./y_measured).^2);

I hope that's enough to get you started. Good luck!
--Arthur

"Luca Cerone" <luca_cerone#_remove_this#@yahoo.it> wrote in message <hvq9td$4dc$1(a)fred.mathworks.com>...
> Dear all,
> I have developed a small ODE model (3 equations) depending on 6 parameters, (say a,b,c etc all of them have to be >0 and <1) and I would like to find parameters for the solution that best approximates some experimental (biological) data I've collected.
> I know that it should be an easy and straightforward task using lsqcurvefit, fminsearch
> or other function from the optimization toolbox, but I'd like to have some advices on the best strategies so that I can avoid common problems, like the one about tolerances, number of steps and so on.
> First of all, which objective function would you use? As the data can go from low value to high value, I think that "mean of relative squared errors" would be more appropriate to use.
> Also, I found a nice non-dimentional version of the ODE and the number of parameter reduce to 3. Is there any way I can use this kind of information?
>
> Also data are obtained from experiment repeated at different time points.
> Would you regard all the points in a unique curve?
> Would you find parameters for each set of experiment and then average post/process it?
>
> Can you please guide me a little bit in this process?
> I've done it a few times with toys data and example, but now that I have to use real
> data the process doesn't seem so easy :)
> All kind of advices are really appreciated.
> Cheers, -Luca
From: Luca Cerone on
Hi Arthur, thanks for all the advices.
>What do you want to do with the model once you fit the parameters?
I'd like my model to fit well the experimental data, and hopefully make unexpected prediction on the behaviour of the system

> In that case, you should perform fitting using the non-dimensionalized model.
I thought that as well. But as the re-scaled variable depend on some of the unknown parameters, how would I have to treat the experimental data???

> Another issue when fitting ODE models is that you must take care when >estimating the gradient or Jacobian of your objective function.
Thanks for the advice I have analytical form for the Jacobian, will try to use it.

> Personally, I often start with the genetic algorithm function (ga) when I need >to look at a large parameter space.
I tried, but could never succeed in setting properly the constraints!

> So, if your error model is proportional, you could write your objective function >as follows: objective = sum((1 - y_predicted./y_measured).^2);
Thanks again.

Just an other question, as for every set of parameter Matlab has to solve an ODE system, is there anything I can do to speed up the process a little bit?
Some smart way of organizing variables, pass parameters and so on...
(especially for the genetic algorithm minimization this would be extremely helpful).
From: Arthur Goldsipe on
> > In that case, you should perform fitting using the non-dimensionalized model.
> I thought that as well. But as the re-scaled variable depend on some of the unknown
> parameters, how would I have to treat the experimental data???

In order to do the fit, you do need to be able to compare your simulation to the experimental data. So after a simulation you should convert the non-dimensionalized results back into the appropriate units using the scaling parameters. So your objective function will need to do both the simulation and the rescaling.

> > Personally, I often start with the genetic algorithm function (ga) when I need
> > to look at a large parameter space.
> I tried, but could never succeed in setting properly the constraints!

There are a lot of options. But I think it's worthwhile to spend some time reading the documentation to learn how to use the function. If you still can't figure it out, then post some more details about the specific problems you're having (perhaps with a more specific subject line), and someone will probably be able to help you.

> Just an other question, as for every set of parameter Matlab has to solve an ODE
> system, is there anything I can do to speed up the process a little bit?
> Some smart way of organizing variables, pass parameters and so on...
> (especially for the genetic algorithm minimization this would be extremely helpful).

First, have you checked to see whether you can get an analytical solution to your (non-dimensionalized) ODEs? Try using the Symbolic Toolbox (if you have access to it) to solve the equations. An analytical solution would certainly be the most efficient option.

If you cannot obtain an analytical solution, choose an appropriate ODE solver. Biological ODEs are often stiff, so ode15s would be my initial guess for the best solver. You can also dramatically improve performance by providing a Jacobian function to the ODE solver. Lastly, choose your relative and absolute tolerances appropriately. You should probably keep the default relative tolerance of 1e-3. An appropriate absolute tolerance depends on the scale of the state variables over the course of the simulation. The default value is 1e-6, but you should set the value to whatever value is "essentially zero" for your variables (but remember that we are talking about the non-dimensionalized variable).

Good luck!
-Arthur
From: Luca Cerone on
> So after a simulation you should convert the non-dimensionalized results back >into the appropriate units using the scaling parameters. So your objective >function will need to do both the simulation and the rescaling.
Actually I'm looking for the parameters, and also measurements have not a unit so the rescaling is not possible if I work with the dimentionless model.

Thanks again,
-Luca