Ridge regression for polynomial fitting [Matlab]

From: J B on 27 Jun 2010 14:08

Hello,

I wish to use ridge regression to learn a mapping between data that is
of 100 dimensions and another that is of 22 dimensions. Before I can
do this, I need to gain a better understanding of ridge regression and
how it is used within matlab, so I am trying to use ridge regression
in the simple case of polynomial fitting.

In my test example, I am using the function y = sin(2*pi*x), where x
is 11 points evenly spaced in the range [0.0, 1.0] (incrementing from
0.0 by 0.1).

Given the help manual, the ridge function is defined as follows: B1 =
ridge(Y, X, K). However, the manual goes on to state that for making
predictions, B0 = ridge(Y, X, K, 0) is better suited.

Firstly, given the above function and values of x, is it correct that
my input variables should be defined as follows:

Y =
0
0.5878
0.9511
0.9511
0.5878
0.0000
-0.5878
-0.9511
-0.9511
-0.5878
-0.0000

X =
0 0 0 0
0.1000 0.0100 0.0010 0.0001
0.2000 0.0400 0.0080 0.0016
0.3000 0.0900 0.0270 0.0081
0.4000 0.1600 0.0640 0.0256
0.5000 0.2500 0.1250 0.0625
0.6000 0.3600 0.2160 0.1296
0.7000 0.4900 0.3430 0.2401
0.8000 0.6400 0.5120 0.4096
0.9000 0.8100 0.7290 0.6561
1.0000 1.0000 1.0000 1.0000

K = 0
(I believe this will result in standard least squares polynomial
fitting?)

Which results in B0 =
3.5567
-10.9474
7.1764
-0.0000

Given the polynomial w0 + w1*x + w2*x^2 + ... are these coefficients
starting at w0 or w1?

Plotting the resulting polynomial indicates that this is incorrect.
Can someone please point out what I have misunderstood?

Thanks!
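
For comparison, a plain least-squares fit of the same data can be sketched with backslash. This is only an illustration under stated assumptions: it takes the model to be w0 + w1*x + ... + w4*x^4, adds an explicit column of ones for w0 (the X shown above has no such column), and uses variable names (A, w, p) that are not from the post. As far as the documentation describes it, ridge(Y, X, K, 0) returns the constant term first, followed by one coefficient per column of X, whereas ridge(Y, X, K) returns coefficients for centered and scaled predictors with no constant term, so the two outputs are not directly interchangeable when plotting.

```matlab
% Sketch: ordinary least-squares fit of a 4th-order polynomial to sin(2*pi*x).
x = (0:0.1:1)';                 % 11 evenly spaced points on [0, 1]
y = sin(2*pi*x);

X = [x, x.^2, x.^3, x.^4];      % design matrix as in the post (no constant column)
A = [ones(size(x)), X];         % prepend a column of ones for the constant term w0

w = A \ y;                      % least-squares solution, ordered [w0; w1; w2; w3; w4]
yhat = A * w;                   % fitted values at the sample points

% Cross-check against polyfit, which returns the highest power first.
p = polyfit(x, y, 4);           % [p4 p3 p2 p1 p0]
```

Here w(1) is the constant term and w(k+1) multiplies x^k, so flipping p should reproduce w up to rounding.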

From: John D'Errico on 27 Jun 2010 14:18

J B <trifinite84(a)googlemail.com> wrote in message <e1e9378c-dc91-4381-b967-e8c027284bbe(a)d16g2000yqb.googlegroups.com>...
> Hello,
>
> I wish to use ridge regression to learn a mapping between data that is
> of 100 dimensions and another that is of 22 dimensions.

This is an utterly, completely, absolutely, foolish task.

Unless perhaps the model will be a first order model, so
the model would have only 101 terms in it. But even a
full quadratic polynomial model for 100 dimensional data
will have roughly 5,000 coefficients to estimate.

I find it amazing the foolish things attempted by people
who have no concept at all of modeling. Throwing ridge
regression at the problem cannot make a silk purse from
a sow's ear.

John

From: J B on 27 Jun 2010 15:40

Is someone able to provide some practicable advice in relation to my
questions regarding polynomial fitting with ridge regression?

John, maybe I didn't make myself abundantly clear: I am no expert in
this matter; far from it. I have posted a genuine question that I am
hoping to have either answered or critiqued. I welcome criticism but
your reply is clearly an attempt to offend rather than inform.

From: John D'Errico on 27 Jun 2010 16:56

No. I am merely trying to tell you that this is a hopeless
task. If nobody ever tells you that, then you will continue
on this foolish quest, and then wonder what it is that you
are doing wrong.

I'm not trying to offend you. But the fact is, it is obvious
that you do not understand the subject. Suppose I decided
to attempt to fly a homebuilt rocket to Mars? I'm no expert
in the area, so it might be a good idea for someone at some
time to tell me that my goal is a foolish one, especially if I
plan on riding that home-built rocket myself.

You don't say what order model you contemplate using.
But you show a 4th order model. So do you realize that,
say, a 4th order polynomial model on a 100 dimensional
space requires the estimation of

nchoosek(100,4)
ans =
3921225

terms? And those are only the 4th order terms. Don't leave
out the 161700 cubic terms, the 4950 quadratic terms, the
100 linear terms, and a constant in that model. Do you have
enough data to justify a model this size? Do you understand
how much data you need to estimate that model? I don't
think so. Do you have adequate data to have a chance of
estimating that model?
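
As a side note on these counts, nchoosek(100,4) counts only the products of four distinct variables. If repeated factors such as x1^2*x2*x3 or x1^4 are also allowed, the number of monomials of exact degree k in d variables is nchoosek(d+k-1, k), which is somewhat larger. The sketch below works this out for d = 100; the variable names are illustrative only.

```matlab
% Sketch: number of monomial terms in a full polynomial model on d variables.
d = 100;

distinct4 = nchoosek(d, 4);        % products of four distinct variables: 3921225

% Monomials of exact degree k, repeated factors allowed: nchoosek(d + k - 1, k)
n_linear    = nchoosek(d, 1);      % 100
n_quadratic = nchoosek(d + 1, 2);  % 5050
n_cubic     = nchoosek(d + 2, 3);  % 171700
n_quartic   = nchoosek(d + 3, 4);  % 4421275

% All terms of degree <= 4, including the constant.
n_total = 1 + n_linear + n_quadratic + n_cubic + n_quartic;   % 4598126
```

Either way of counting gives millions of coefficients, so the point about needing a great deal of data is unchanged.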

You clearly don't understand ridge regression, or why/when
it is appropriate to use. But a ridge regression will not help
you to solve this problem. Just because you can form a
non-singular matrix does not suggest the coefficients from
that ridge estimator will mean anything at all.

Use of a ridge estimator to bias the coefficients of a model
towards zero does exactly that. It simply biases the coefficients
towards zero. But there is no reason to presume that you will
know how much bias to apply.
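
The shrinkage toward zero can be seen on the small polynomial example from the first post. The following is a minimal sketch of the textbook ridge estimator b = (Z'*Z + k*I) \ (Z'*yc) on centered and scaled predictors, written without the statistics toolbox; it is not claimed to match the internals or scaling conventions of the toolbox ridge function, and the values in ks are arbitrary choices.

```matlab
% Sketch: textbook ridge estimator on standardized polynomial predictors.
x = (0:0.1:1)';
y = sin(2*pi*x);
X = [x, x.^2, x.^3, x.^4];

% Center and scale each column of X; center y so no constant term is needed.
mu    = mean(X, 1);
sigma = std(X, 0, 1);
Z  = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
yc = y - mean(y);

ks = [0, 1e-3, 1e-1, 10];              % ridge parameters to compare (arbitrary)
p  = size(Z, 2);
B  = zeros(p, numel(ks));
for j = 1:numel(ks)
    % k = 0 reduces to ordinary least squares on the standardized data.
    B(:, j) = (Z' * Z + ks(j) * eye(p)) \ (Z' * yc);
end

coef_norms = sqrt(sum(B.^2, 1));       % Euclidean norm of each coefficient vector
```

The norms in coef_norms should decrease as k grows, which is the bias toward zero described above; nothing in the computation says which k is the right one.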

You asked for a critique. This is it. Before you try to use a
ridge regression, why not learn how to do a simple regression
in the first place? Learn how to build a model with a constant
term in the model! Learn how to do this regression modeling
using backslash, rather than just throwing the statistics toolbox
at it, without even understanding the basics of regression.

Learn how much data you should expect to need for a model.
Learn what good data looks like. Learn how a perfectly good
model can be corrupted by something as simple as poor
scaling of your data. Learn what high leverage points are.
Learn about bad data, non-normal data.
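
The remark about scaling can be illustrated with the condition number of the polynomial design matrix. This is a sketch only: the factor of 1000 is an arbitrary stand-in for badly scaled data, and the names A_unit and A_big are illustrative.

```matlab
% Sketch: effect of predictor scaling on the conditioning of a polynomial fit.
x = (0:0.1:1)';                                    % predictor on [0, 1]
A_unit = [ones(size(x)), x, x.^2, x.^3, x.^4];

xs = 1000 * x;                                     % same points, measured in other units
A_big = [ones(size(xs)), xs, xs.^2, xs.^3, xs.^4];

c_unit = cond(A_unit);                             % condition number with x on [0, 1]
c_big  = cond(A_big);                              % condition number with x on [0, 1000]
```

The second condition number should be many orders of magnitude larger, even though both matrices describe the same model on the same data.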

Don't even think about learning what ridge regression is or
how to use it on a problem of this magnitude until you have
learned the basics of this problem.

Feel free to try flying to Mars on that homebuilt rocket. At
least I've told you the truth. If you need sugar coating on
that truth, that is your problem.

John

From: Josh on 27 Jun 2010 17:40

The task that I am undertaking is reproducing the work of Agarwal and
Triggs outlined in their paper "Recovering 3D Human Pose from
Monocular Images". Their paper presents an approach for recovering
pose (55D vectors) from silhouette images (100D vectors) using ridge
regression.

Yet, you say this is a hopeless task. Can you please explain the
discrepancy?
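
For reference, the first-order case mentioned earlier in the thread (a model that is linear in the 100 inputs) can be sketched for several outputs at once. This is not a reproduction of the method in the paper, which is only named here; the synthetic data, the sample count n, and the ridge parameter k are all made-up placeholders chosen so the snippet runs.

```matlab
% Sketch: multi-output ridge regression that is linear in the inputs.
n = 200;  d = 100;  m = 55;            % samples, input dimension, output dimension

% Deterministic synthetic data (placeholder, not real silhouette or pose data).
X = sin((1:n)' * (1:d));               % n-by-d inputs
Y = cos((1:n)' * (1:m) / 7);           % n-by-m outputs

% Center inputs and outputs so the constant term is handled separately.
mx = mean(X, 1);
my = mean(Y, 1);
Xc = bsxfun(@minus, X, mx);
Yc = bsxfun(@minus, Y, my);

k = 1;                                 % ridge parameter (arbitrary)
W  = (Xc' * Xc + k * eye(d)) \ (Xc' * Yc);   % d-by-m weights, one column per output
b0 = my - mx * W;                      % 1-by-m constant terms

Yhat = bsxfun(@plus, X * W, b0);       % predictions on the training inputs
```

This estimates d + 1 = 101 coefficients per output, which is the only model size in this thread described as possibly reasonable for 100-dimensional data.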
