From: Matt J on 25 Jan 2010 17:08

"John D'Errico" <woodchips(a)rochester.rr.com> wrote in message <hjl2hd$8rb$1(a)fred.mathworks.com>...
> As was said before, fmincon cannot jump between
> discontinuous pieces of a domain, so you would
> still need to run fmincon multiple times.
==========

I don't believe that was ever really substantiated (see Message #7). The sizes of the jumps taken by fmincon are influenced by the Hessian. There's no way to know a priori how big the jumps will be, and hence no way to know whether each sub-domain will act as a capture basin. Also, you do realize, I'm sure, that fmincon's algorithms can produce infeasible iterates, x. Since we know the iterates have the ability to wander outside the constrained region, what reason is there to think the algorithm will get stuck in any particular sub-domain?

> It
> would force fmincon to form constraint derivatives
> when nothing of the sort is necessary, turning a
> set of simple bound constraints into the black box
> of nonlinear constraints.
=======

The constraint derivatives would have a pretty simple form. The extra computation in the derivatives may be outweighed by faster convergence if you have fewer constraints to deal with. If you wanted, you could still apply bound constraints on the convex hull of the two disjoint intervals.

Anyway, all I'm ultimately saying is this: you obviously always better your odds of finding the global minimum by cutting the feasible region into more pieces and optimizing over each one. However, that's equally true of a problem where the feasible region is connected. Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more compelling a reason to use that approach on problems with disconnected feasible sets.
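[Editor's note: the quartic parametrization Matt refers to later in the thread is not written out in this excerpt, but one standard way to encode "x in [a,b] OR x in [c,d]" as a single smooth inequality is the product of the four boundary terms. A minimal Python sketch, using the hypothetical intervals [-7,-3] and [3,7] (the OP's actual intervals are not shown here):]

```python
# Hypothetical disjoint intervals [a,b] and [c,d], with a < b < c < d.
a, b, c, d = -7.0, -3.0, 3.0, 7.0

def g(x):
    """Quartic whose sign encodes membership in [a,b] U [c,d]:
    g(x) <= 0 exactly when x lies in the union. With a < b < c < d,
    the four factors flip sign one at a time as x sweeps left to right."""
    return (x - a) * (x - b) * (x - c) * (x - d)

# Sign pattern: feasible (<= 0) inside either interval,
# infeasible (> 0) in the gap between them and outside the hull.
print(g(-5.0))   # inside [a,b]   -> negative
print(g(0.0))    # in the gap     -> positive
print(g(5.0))    # inside [c,d]   -> negative
print(g(10.0))   # outside hull   -> positive
```

This turns the disconnected feasible set into one nonlinear constraint g(x) <= 0 that a solver like fmincon can be handed directly, which is exactly the formulation the disagreement below is about.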
From: Bruno Luong on 25 Jan 2010 18:09

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjl4o3$qv$1(a)fred.mathworks.com>...
> Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more of a compelling reason to use it on problems with disconnected feasible sets.

Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design. Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.

The Lagrange argument stated earlier is also only approximate, since Lagrange multipliers are non-zero only for active constraints. In this case there is only one active constraint, even when the constraints are combined with an "AND". Here the OR constraints must be treated separately, each as a single constraint.

Finally, fmincon separates the constraints into 3 categories: bound, linear, and nonlinear. Do you ever wonder why? Because the treatment is different: very simple for the first, and quite tricky for the last. When possible, use the simplest one; that lets fmincon choose the best strategy. Of course, FSOLVE can be used to solve a linear system, but that would be silly. A quite similar suggestion has been made here.

Bruno
From: John D'Errico on 25 Jan 2010 18:29

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjl4o3$qv$1(a)fred.mathworks.com>...
> "John D'Errico" <woodchips(a)rochester.rr.com> wrote in message <hjl2hd$8rb$1(a)fred.mathworks.com>...
> > As was said before, fmincon cannot jump between
> > discontinuous pieces of a domain, so you would
> > still need to run fmincon multiple times.
> ==========
>
> I don't believe that was ever really substantiated (see Message #7). The sizes of the jumps taken by fmincon are influenced by the Hessian. There's no way to know a priori how big the jumps will be, and hence no way to know whether each sub-domain will act as a capture basin. Also, you do realize, I'm sure, that fmincon's algorithms can produce infeasible iterates, x. Since we know the iterates have the ability to wander outside the constrained region, what reason is there to think the algorithm will get stuck in any particular sub-domain?

Don't be silly. Hoping that fmincon can make a discontinuous jump on a prayer is a foolish way to perform an optimization. The methods used by fmincon simply do not presume this class of constraints as a possibility.

> > It
> > would force fmincon to form constraint derivatives
> > when nothing of the sort is necessary, turning a
> > set of simple bound constraints into the black box
> > of nonlinear constraints.
> =======
>
> The constraint derivatives would have a pretty simple form. The extra computation in the derivatives may be outweighed by faster convergence if you have fewer constraints to deal with. If you wanted, you could still apply bound constraints on the convex hull of the two disjoint intervals.
>
> Anyway, all I'm ultimately saying is this: you obviously always better your odds of finding the global minimum by cutting the feasible region into more pieces and optimizing over each one. However, that's equally true of a problem where the feasible region is connected.
>
> Unless you can clarify for me why fmincon can't jump between two disconnected sub-domains, I can't see why there's any more compelling a reason to use that approach on problems with disconnected feasible sets.

Starting with a point inside one subset of the domain space, exactly how do you expect fmincon to know that there is another part of the domain space that will yield feasible results? Magic is not an answer here, nor is clairvoyance. Once fmincon finds itself inside a feasible part of the domain space, it will not happily jump to another.

Optimizers are simple things. They try to walk downhill. When they bounce into a wall, they try to feel their way along it. They cannot simply guess to try a wildly different point that appears to be outside of the nonlinear constraints you provide, in the expectation that you did something foolish like this.

Try this, as a timing comparison:

xy0 = rand(1,2)*3 - 1
xy0 =
     -0.89666      0.31623

opts = optimset('fmincon');
opts.Display = 'none';
opts.LargeScale = 'off';

% minimize -exp(-(x-x0)^2 - (y-y0)^2), subject to bound constraints,
% where x and y both must live in [0,1].
fun = @(xy) -exp(-sum((xy - xy0).^2));

tic,[res,fval,exitflag,output] = fmincon(fun,[.5 .5],[],[],[],[],[0 0],[1 1],[],opts);toc
res
output

Elapsed time is 0.042823 seconds.
res =
            0      0.31622
output =
       iterations: 4
        funcCount: 15
     lssteplength: 1
         stepsize: 0.01174
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 8.2701e-06
          message: [1x172 char]

So using simple bounds it took 0.043 seconds to return a result. Now try it using nonlinear constraints to do the same thing.

nonlcon = @(xy) deal(xy.*(xy-1),[]);
tic,[res,fval,exitflag,output] = fmincon(fun,[.5 .5],[],[],[],[],[],[],nonlcon,opts);toc
res
output

Elapsed time is 0.067194 seconds.
res =
  -9.0844e-15      0.31623
output =
       iterations: 5
        funcCount: 18
     lssteplength: 1
         stepsize: 0.00015521
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 1.6913e-06
          message: [1x172 char]

See that it took 56% longer to solve the same problem. Nonlinear constraints take more work than simple bound constraints.

John
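[Editor's note: John's comparison is easy to reproduce outside MATLAB. A rough Python analogue with scipy's SLSQP, fixing xy0 to the random draw shown above; the point here is not timing but that the bound-constrained and nonlinearly-constrained formulations of the same box reach the same constrained minimum, the projection of xy0 onto [0,1]^2:]

```python
import numpy as np
from scipy.optimize import minimize

xy0 = np.array([-0.89666, 0.31623])   # fixed to John's random draw above
fun = lambda xy: -np.exp(-np.sum((xy - xy0) ** 2))

# Formulation 1: simple bounds, x and y each in [0,1].
res_b = minimize(fun, [0.5, 0.5], method="SLSQP", bounds=[(0, 1), (0, 1)])

# Formulation 2: the same box expressed as nonlinear inequalities
# x*(x-1) <= 0, componentwise (SLSQP expects g(x) >= 0, so negate).
cons = {"type": "ineq", "fun": lambda xy: -xy * (xy - 1)}
res_c = minimize(fun, [0.5, 0.5], method="SLSQP", constraints=[cons])

# Both land on the projection of xy0 onto the box: approx [0, 0.31623].
print(res_b.x)
print(res_c.x)
```

As in John's MATLAB run, the nonlinear-constraint formulation does the same job with more machinery; bounds give the solver strictly more structure to exploit.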
From: Matt J on 26 Jan 2010 11:14

"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hjl8ah$f9h$1(a)fred.mathworks.com>...

Hey Bruno,

> Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design.
>
> Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.
========================

Gradient algorithms can certainly take large jumps if you don't manage to initialize them close to a stationary point. We usually don't like to rely on them jumping from one basin to another, which is why we like to choose good initializers. However, that is not relevant to the situation at hand. As I said before, we have no reason to think, under my formulation of the constraints, that the two disjoint intervals in the OP's problem would occupy separate capture basins. fmincon does not need to satisfy the constraints at every iteration, so it can freely roam between the two disjoint sub-domains, even if it takes very small steps.

Remember, it's the Lagrangian that we're minimizing. If you parametrize the constraints using my proposed quartic polynomial, the function we're minimizing will still be the weighted sum of that quartic and the OP's objective function (whose form we still don't know, BTW). Who knows how that function looks on the interval [-7,7]? It could be perfectly convex for all we know.

Now, I've already acknowledged that it could be prudent to assume the Lagrangian is *not* convex and to cut up the problem domain into sub-domains. But that would be equally true if we didn't have 2 disjoint intervals on our hands. If the constraints were conventionally convex but the objective function were multi-modal, you would have the exact same (and typical) problem.

> The Lagrange argument stated earlier is also only approximate, since Lagrange multipliers are non-zero only for active constraints. In this case there is only one active constraint, even when the constraints are combined with an "AND". Here the OR constraints must be treated separately, each as a single constraint.
===================

But you don't know in advance which constraints will be active at the minimum, so you always have to iterate over all the Lagrange multipliers.

> Finally, fmincon separates the constraints into 3 categories: bound, linear, and nonlinear. Do you ever wonder why? Because the treatment is different: very simple for the first, and quite tricky for the last. When possible, use the simplest one; that lets fmincon choose the best strategy.
==================

Again, it is possible in all problems, convexly constrained or otherwise, to cut up the domain into box-bounded regions and obtain sub-problems with simpler constraints. Did you ever wonder why people don't always do this? It simplifies the sub-problem, but not the overall one.
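[Editor's note: the "cut the domain into pieces" strategy both posters keep returning to is simple to sketch: solve one bound-constrained sub-problem per disjoint interval and keep the best result. A minimal Python sketch with scipy, using a hypothetical objective and the same hypothetical intervals [-7,-3] and [3,7] as above:]

```python
from scipy.optimize import minimize_scalar

# Hypothetical objective; its unconstrained minimizer x = 4 happens to
# lie inside the second interval, so the global constrained min is x = 4.
f = lambda x: (x - 4.0) ** 2

intervals = [(-7.0, -3.0), (3.0, 7.0)]

# One bound-constrained sub-problem per disjoint interval; keep the best.
results = [minimize_scalar(f, bounds=iv, method="bounded") for iv in intervals]
best = min(results, key=lambda r: r.fun)

# The left interval alone only yields the boundary point x = -3 (f = 49);
# the multi-start over both intervals recovers the global minimum x = 4.
print(best.x, best.fun)
```

This is the guaranteed-but-laborious route both sides agree works; the dispute is only over whether the single-run quartic formulation can be trusted instead.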
From: John D'Errico on 26 Jan 2010 11:59

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hjn4cc$264$1(a)fred.mathworks.com>...
> "Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hjl8ah$f9h$1(a)fred.mathworks.com>...
>
> Hey Bruno,
>
> > Gradient-based methods (e.g., fmincon) seldom jump. When they do, it is during the line search, and it happens by accident rather than by design.
> >
> > Otherwise (if they could jump reliably) no one would care about selecting a good starting point to ensure good convergence.
> ========================
>
> Gradient algorithms can certainly take large jumps if you don't manage to initialize them close to a stationary point. We usually don't like to rely on them jumping from one basin to another, which is why we like to choose good initializers.

Again, your basic idea is to pray that you either start in a good place, or that fmincon can somehow get arbitrarily lucky and try a point outside of where it thinks the constraints will be satisfied? Your understanding of fmincon seems to be the issue here. Prayer never seems to work as well for me when I work with MATLAB, but maybe you have a better connection.

Perhaps more tests with fmincon would help convince you.

fun = @(xy) -exp(-sum(xy.^2)/10);
nonlcon = @(xy) deal(1-cos(xy),[]);
xstart = 2*pi + randn(1,2)/5
% xstart = randn(1,2)/5

options = optimset('fmincon');
options.Display = 'iter';
options.LargeScale = 'off';

[res,fv,exitflag,output] = fmincon(fun,xstart,[],[],[],[],[],[],nonlcon,options)

See how this works, as long as you start in the wrong part of the parameter space.

xstart =
       6.1655       6.7198
res =
       6.2832        6.282
fv =
  -0.00037292
exitflag =
     1
output =
       iterations: 9
        funcCount: 30
     lssteplength: 1
         stepsize: 0.0011968
        algorithm: 'medium-scale: SQP, Quasi-Newton, line-search'
    firstorderopt: 2.8026e-07
          message: [1x143 char]

fmincon does not escape this incorrect choice. It will not in general ever try to do so, even for this trivially easy function to minimize. Of course, had I allowed it to start in the vicinity of [0,0], it does converge to zero well enough.

John