From: Matt J on 25 Jan 2010 17:08

"John D'Errico" <woodchips(a)rochester.rr.com> wrote in message <hjl2hd$8rb$1(a)fred.mathworks.com>...
> As was said before, fmincon cannot jump between
> discontinuous pieces of a domain, so you would
> still need to run fmincon multiple times.
==========

I don't believe that was ever really substantiated (see Message #7). The sizes of the jumps taken by fmincon are influenced by the Hessian. There's no way to know a priori how big the jumps will be, and hence no way to know whether each sub-domain will act as a capture basin. Also, you do realize, I'm sure, that fmincon's algorithms can produce infeasible iterates, x. Since we know the iterates have the ability to wander outside the constrained region, what reason is there to think the algorithm will get stuck in any particular sub-domain?

> It
> would force fmincon to form constraint derivatives
> when nothing of the sort is necessary, turning a
> set of simple bound constraints into the black box
> of nonlinear constraints.
=======

The constraint derivatives would have a pretty simple form. The extra computation in the derivatives may be outweighed by faster convergence if you have fewer constraints to deal with. If you wanted, you could still apply bound constraints on the convex hull of the two disjoint intervals.

Anyway, all I'm ultimately saying is this: you obviously always better your odds of finding the global minimum by cutting the feasible region into more pieces and optimizing over each one. However, that's equally true of a problem where the feasible region is connected. Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more compelling a reason to use that approach on problems with disconnected feasible sets.
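[Editor's note: the quartic parametrization Matt refers to later in the thread is not written out in this excerpt, but one standard way to encode "x in [a,b] OR x in [c,d]" as a single smooth inequality is the product of the four boundary terms. A minimal Python sketch, using the hypothetical intervals [-7,-3] and [3,7] (the OP's actual intervals are not shown here):]

```python
# Hypothetical disjoint intervals [a,b] and [c,d], with a < b < c < d.
a, b, c, d = -7.0, -3.0, 3.0, 7.0

def g(x):
    """Quartic whose sign encodes membership in [a,b] U [c,d]:
    g(x) <= 0 exactly when x lies in the union. With a < b < c < d,
    the four factors flip sign one at a time as x sweeps left to right."""
    return (x - a) * (x - b) * (x - c) * (x - d)

# Sign pattern: feasible (<= 0) inside either interval,
# infeasible (> 0) in the gap between them and outside the hull.
print(g(-5.0))   # inside [a,b]   -> negative
print(g(0.0))    # in the gap     -> positive
print(g(5.0))    # inside [c,d]   -> negative
print(g(10.0))   # outside hull   -> positive
```

This turns the disconnected feasible set into one nonlinear constraint g(x) <= 0 that a solver like fmincon can be handed directly, which is exactly the formulation the disagreement below is about.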
From: Bruno Luong on 25 Jan 2010 18:09

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjl4o3$qv$1(a)fred.mathworks.com>...
> Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more of a compelling reason to use it on problems with disconnected feasible sets.

Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design. Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.

The Lagrange argument stated earlier is also only approximate, since Lagrange multipliers are non-zero only for active constraints. In this case there is only one active constraint, even when the constraints are combined with an "AND". Here the OR constraints must be treated separately, each as a single constraint.

Finally, fmincon separates the constraints into 3 categories: bound, linear, and nonlinear. Do you ever wonder why? Because the treatment is different: very simple for the first, and quite tricky for the last. When possible, use the simplest one; that lets fmincon choose the best strategy. Of course, FSOLVE can be used to solve a linear system, but that would be silly. A quite similar suggestion has been made here.

Bruno
From: John D'Errico on 25 Jan 2010 18:29

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjl4o3$qv$1(a)fred.mathworks.com>...
> "John D'Errico" <woodchips(a)rochester.rr.com> wrote in message <hjl2hd$8rb$1(a)fred.mathworks.com>...
> > As was said before, fmincon cannot jump between
> > discontinuous pieces of a domain, so you would
> > still need to run fmincon multiple times.
> ==========
>
> I don't believe that was ever really substantiated (see Message #7). The sizes of the jumps taken by fmincon are influenced by the Hessian. There's no way to know a priori how big the jumps will be, and hence no way to know whether each sub-domain will act as a capture basin. Also, you do realize, I'm sure, that fmincon's algorithms can produce infeasible iterates, x. Since we know the iterates have the ability to wander outside the constrained region, what reason is there to think the algorithm will get stuck in any particular sub-domain?

Don't be silly. Hoping that fmincon can make a discontinuous jump on a prayer is a foolish way to perform an optimization. The methods used by fmincon simply do not presume this class of constraints as a possibility.

> > It
> > would force fmincon to form constraint derivatives
> > when nothing of the sort is necessary, turning a
> > set of simple bound constraints into the black box
> > of nonlinear constraints.
> =======
>
> The constraint derivatives would have a pretty simple form. The extra computation in the derivatives may be outweighed by faster convergence if you have fewer constraints to deal with. If you wanted, you could still apply bound constraints on the convex hull of the two disjoint intervals.
>
> Anyway, all I'm ultimately saying is this: you obviously always better your odds of finding the global minimum by cutting the feasible region into more pieces and optimizing over each one. However, that's equally true of a problem where the feasible region is connected.
>
> Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more compelling a reason to use that approach on problems with disconnected feasible sets.

Starting with a point inside one subset of the domain space, exactly how do you expect fmincon to know that there is another part of the domain space that will yield feasible results? Magic is not an answer here, nor is clairvoyance. Once fmincon finds itself inside a feasible part of the domain space, it will not happily jump to another.

Optimizers are simple things. They try to walk downhill. When they bounce into a wall, they try to feel their way along it. They cannot simply guess to try a wildly different point that appears to be outside of the nonlinear constraints you provide, in the expectation that you did something foolish like this.

Try this, as a timing comparison:

xy0 = rand(1,2)*3 - 1
xy0 =
     -0.89666      0.31623

opts = optimset('fmincon');
opts.Display = 'none';
opts.LargeScale = 'off';

% minimize -exp(-(x-x0)^2 - (y-y0)^2), subject to bound constraints,
% where x and y both must live in [0,1].
fun = @(xy) -exp(-sum((xy - xy0).^2));

tic,[res,fval,exitflag,output] = fmincon(fun,[.5 .5],[],[],[],[],[0 0],[1 1],[],opts);toc
res
output

Elapsed time is 0.042823 seconds.
res =
            0      0.31622
output =
       iterations: 4
        funcCount: 15
     lssteplength: 1
         stepsize: 0.01174
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 8.2701e-06
          message: [1x172 char]

So using simple bounds it took 0.043 seconds to return a result. Now try it using nonlinear constraints to do the same thing.

nonlcon = @(xy) deal(xy.*(xy-1),[]);
tic,[res,fval,exitflag,output] = fmincon(fun,[.5 .5],[],[],[],[],[],[],nonlcon,opts);toc
res
output

Elapsed time is 0.067194 seconds.
res =
  -9.0844e-15      0.31623
output =
       iterations: 5
        funcCount: 18
     lssteplength: 1
         stepsize: 0.00015521
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 1.6913e-06
          message: [1x172 char]

See that it took 56% longer to solve the same problem. Nonlinear constraints take more work than simple bound constraints.

John
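[Editor's note: John's comparison is easy to reproduce outside MATLAB. A rough Python analogue with scipy's SLSQP, fixing xy0 to the random draw shown above; the point here is not timing but that the bound-constrained and nonlinearly-constrained formulations of the same box reach the same constrained minimum, the projection of xy0 onto [0,1]^2:]

```python
import numpy as np
from scipy.optimize import minimize

xy0 = np.array([-0.89666, 0.31623])   # fixed to John's random draw above
fun = lambda xy: -np.exp(-np.sum((xy - xy0) ** 2))

# Formulation 1: simple bounds, x and y each in [0,1].
res_b = minimize(fun, [0.5, 0.5], method="SLSQP", bounds=[(0, 1), (0, 1)])

# Formulation 2: the same box expressed as nonlinear inequalities
# x*(x-1) <= 0, componentwise (SLSQP expects g(x) >= 0, so negate).
cons = {"type": "ineq", "fun": lambda xy: -xy * (xy - 1)}
res_c = minimize(fun, [0.5, 0.5], method="SLSQP", constraints=[cons])

# Both land on the projection of xy0 onto the box: approx [0, 0.31623].
print(res_b.x)
print(res_c.x)
```

As in John's MATLAB run, the nonlinear-constraint formulation does the same job with more machinery; bounds give the solver strictly more structure to exploit.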
From: Matt J on 26 Jan 2010 11:14

"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hjl8ah$f9h$1(a)fred.mathworks.com>...

Hey Bruno,

> Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design.
>
> Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.
========================

Gradient algorithms can certainly take large jumps if you don't manage to initialize them close to a stationary point. We usually don't like to rely on them jumping from one basin to another, which is why we like to choose good initializers. However, that is not relevant to the situation at hand. As I said before, we have no reason to think, under my formulation of the constraints, that the two disjoint intervals in the OP's problem would occupy separate capture basins. fmincon does not need to satisfy the constraints at every iteration, so it can freely roam between the two disjoint sub-domains, even if it takes very small steps.

Remember, it's the Lagrangian that we're minimizing. If you parametrize the constraints using my proposed quartic polynomial, the function we're minimizing will still be the weighted sum of that quartic and the OP's objective function (whose form we still don't know, BTW). Who knows how that function looks on the interval [-7,7]? It could be perfectly convex for all we know.

Now, I've already acknowledged that it could be prudent to assume the Lagrangian is *not* convex and to cut up the problem domain into sub-domains. But that would be equally true if we didn't have 2 disjoint intervals on our hands. If the constraints were conventionally convex but the objective function were multi-modal, you would have the exact same (and typical) problem.

> The Lagrange argument stated earlier is also only approximate, since Lagrange multipliers are non-zero only for active constraints. In this case there is only one active constraint, even when the constraints are combined with an "AND". Here the OR constraints must be treated separately, each as a single constraint.
===================

But you don't know in advance which constraints will be active at the minimum, so you always have to iterate over all the Lagrange multipliers.

> Finally, fmincon separates the constraints into 3 categories: bound, linear, and nonlinear. Do you ever wonder why? Because the treatment is different: very simple for the first, and quite tricky for the last. When possible, use the simplest one; that lets fmincon choose the best strategy.
==================

Again, it is possible in all problems, convexly constrained or otherwise, to cut up the domain into box-bounded regions and obtain sub-problems with simpler constraints. Did you ever wonder why people don't always do this? It simplifies the sub-problem, but not the overall one.
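[Editor's note: the "cut the domain into pieces" strategy both posters keep returning to is simple to sketch: solve one bound-constrained sub-problem per disjoint interval and keep the best result. A minimal Python sketch with scipy, using a hypothetical objective and the same hypothetical intervals [-7,-3] and [3,7] as above:]

```python
from scipy.optimize import minimize_scalar

# Hypothetical objective; its unconstrained minimizer x = 4 happens to
# lie inside the second interval, so the global constrained min is x = 4.
f = lambda x: (x - 4.0) ** 2

intervals = [(-7.0, -3.0), (3.0, 7.0)]

# One bound-constrained sub-problem per disjoint interval; keep the best.
results = [minimize_scalar(f, bounds=iv, method="bounded") for iv in intervals]
best = min(results, key=lambda r: r.fun)

# The left interval alone only yields the boundary point x = -3 (f = 49);
# the multi-start over both intervals recovers the global minimum x = 4.
print(best.x, best.fun)
```

This is the guaranteed-but-laborious route both sides agree works; the dispute is only over whether the single-run quartic formulation can be trusted instead.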
From: John D'Errico on 26 Jan 2010 11:59

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjn4cc$264$1(a)fred.mathworks.com>...
> "Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hjl8ah$f9h$1(a)fred.mathworks.com>...
>
> Hey Bruno,
>
> > Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design.
> >
> > Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.
> ========================
>
> Gradient algorithms can certainly take large jumps if you don't manage to initialize them close to a stationary point. We usually don't like to rely on them jumping from one basin to another, which is why we like to choose good initializers.

Again, your basic idea is to pray that you either start in a good place, or that fmincon can somehow get arbitrarily lucky and try a point outside of where it thinks the constraints will be satisfied? Your understanding of fmincon seems to be the issue here. Prayer never seems to work as well for me when I work with MATLAB, but maybe you have a better connection.

Perhaps more tests with fmincon would help convince you.

fun = @(xy) -exp(-sum(xy.^2)/10);
nonlcon = @(xy) deal(1-cos(xy),[]);
xstart = 2*pi + randn(1,2)/5
% xstart = randn(1,2)/5

options = optimset('fmincon');
options.Display = 'iter';
options.LargeScale = 'off';

[res,fv,exitflag,output] = fmincon(fun,xstart,[],[],[],[],[],[],nonlcon,options)

See how this works, as long as you start in the wrong part of the parameter space.

xstart =
       6.1655       6.7198
res =
       6.2832        6.282
fv =
  -0.00037292
exitflag =
     1
output =
       iterations: 9
        funcCount: 30
     lssteplength: 1
         stepsize: 0.0011968
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 2.8026e-07
          message: [1x143 char]

fmincon does not escape this incorrect choice. It will not in general ever try to do so, even for this trivially easy function to minimize. Of course, had I allowed it to start in the vicinity of [0,0], it does converge to zero well enough.

John