From: Vivek Saxena on
Hi,

Is it possible to perform a linear regression in MATLAB with no constant term?

I have data for 9 regressors and I have to fit a multiple linear regression model of Y (the response) on these 9 regressors without an intercept. That is,

Y = x_1*gamma_1 + x_2*gamma_2 + ..... + x_9*gamma_9 + epsilon

I noticed that regstats automatically appends a column of 1s to the X matrix (corresponding to the 0th regression coefficient being the intercept in the usual formulation), whereas regress assumes that the input X matrix already has such a structure. The documentation states that regress will produce an incorrect model if the constant term is not present.

Thanks

Cheers
Vivek.
From: Peter Perkins on
On 3/24/2010 8:03 AM, Vivek Saxena wrote:
> I noticed that regstats automatically appends a column of 1s to the X
> matrix (corresponding to the 0th regression coefficient being the
> intercept in the usual formulation),

It's true that by passing in 'linear' to REGSTATS, you do get an intercept term, but you can specify any model you want using a terms matrix. In your case, you want a linear term for each of the 9 predictors, no intercept or interactions, and no higher-order terms, so the terms matrix is just eye(9).
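
A minimal sketch of the terms-matrix approach just described. The data is simulated purely for illustration, and the variable names (X, y, gammaTrue, gammaHat) are assumptions rather than anything from the thread. Each row of eye(9) requests one term with power 1 on a single predictor and power 0 on the others, so no constant column is requested.

```matlab
% Simulated data for illustration only: n observations, 9 predictors.
n = 50;
X = randn(n, 9);
gammaTrue = (1:9)';                  % hypothetical true coefficients
y = X*gammaTrue + 0.1*randn(n, 1);   % no intercept in the generating model

% Terms matrix: row k is a linear term in predictor k; there is no
% all-zeros row, so no constant term is included.
model = eye(9);

% Ask only for the coefficient estimates and their t statistics.
stats = regstats(y, X, model, {'beta', 'tstat'});

gammaHat = stats.beta;               % 9-by-1: one coefficient per predictor
```

With this model argument, gammaHat(1) is the coefficient of the first predictor rather than an intercept.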


> whereas regress assumes that the
> input X matrix already has such a structure. The documentation states
> that regress will produce an incorrect model if the constant term is not
> present.

I think you're referring to this:

X should include a column of ones so that the model contains a constant
term. The F statistic and p value are computed under the assumption
that the model contains a constant term, and they are not correct for
models without a constant. The R-square value is one minus the ratio of
the error sum of squares to the total sum of squares. This value can
be negative for models without a constant, which indicates that the
model is not appropriate for the data.

The model itself (the estimated coefficients and their CIs) is estimated correctly when the model does not include an intercept. It's only the F statistic and the R^2 that become invalid when there's no intercept. Both of these goodness-of-fit statistics assume that the model y = constant + error is a special case of the model you're fitting, and if there's no intercept, it isn't.
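
As a rough illustration of this point, the sketch below calls REGRESS on a design matrix with no column of ones and asks only for the coefficients and their confidence intervals, leaving out the stats output that holds R^2 and F. The data is simulated and the variable names are assumptions made for the example.

```matlab
% Simulated data for illustration only.
n = 50;
X = randn(n, 9);                     % deliberately no column of ones
y = X*(1:9)' + 0.1*randn(n, 1);

% Coefficients and 95% confidence intervals (default alpha = 0.05).
% The stats output (R^2, F, p) is not requested because it assumes
% a constant term.
[b, bint] = regress(y, X);

% Cross-check against a plain least-squares solve.
bBackslash = X \ y;
maxDiff = max(abs(b - bBackslash));
```

Here b and bBackslash should agree to rounding error, since both are ordinary least-squares estimates for the same no-intercept model.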

Another possibility is to use LSCOV.
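
A minimal sketch of the LSCOV route, again on simulated data with assumed variable names. LSCOV is part of base MATLAB and returns standard errors and the mean squared error alongside the coefficients; it adds no constant column on its own.

```matlab
% Simulated data for illustration only.
n = 50;
A = randn(n, 9);                     % design matrix, no column of ones
y = A*(1:9)' + 0.1*randn(n, 1);

% Least-squares coefficients, their standard errors, and the MSE.
[coef, se, mse] = lscov(A, y);

% The MSE is the residual sum of squares over n minus the number of
% coefficients.
mseCheck = sum((y - A*coef).^2) / (n - size(A, 2));
```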

Hope this helps.
From: Jos (10584) on

Construct a regression matrix without a column of ones. Example:

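% Simulated data: two regressors and a response generated with no intercept
x1 = cumsum(rand(1,10)) ;
x2 = cumsum(rand(size(x1))) ;
CF = [20 50] ; % true coefficients
y = CF(1) * x1 + CF(2) * x2 + randn(size(x1))/10 ; % small Gaussian noise

% Least-squares fit: the regression matrix has no column of ones
M = [x1(:) x2(:)] ;
fittedCF = M \ y(:) % should be close to CF(:)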

hth
Jos
From: Torsten Hennig on

Say you have measurements
(x_1)_i,...,(x_9)_i, y_i (i=1,...,n).
Define a matrix A with n rows and 9 columns by
A(i,j) = (x_j)_i (i=1,...,n ; j=1,...,9).
Define a vector b by
b(i) = y_i (i=1,...,n).
Then the MATLAB command
gamma = A\b
gives your regression coefficients gamma_j.
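
The construction above can be sketched as follows, using simulated measurements as a stand-in for real data. One naming caution: gamma is also the name of MATLAB's built-in gamma function, so the sketch stores the result in gammaHat (a name chosen here, not from the thread) to avoid shadowing it.

```matlab
% Simulated measurements for illustration only.
n = 30;
A = rand(n, 9);                          % A(i,j) = i-th measurement of x_j
b = A*ones(9, 1) + 0.01*randn(n, 1);     % b(i) = y_i

% Least-squares solution of A*gammaHat = b with no intercept term.
gammaHat = A \ b;

% Residuals; at the least-squares solution they are orthogonal to
% every column of A.
resid = b - A*gammaHat;
```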

Best wishes
Torsten.
From: Vivek Saxena on
Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <hocvur$15d$1(a)fred.mathworks.com>...
>
> The model itself, i.e., the estimated coefficients and their CIs, are estimated correctly when the model does not include an intercept. It's only the F statistic and the R^2 that become invalid when there's no intercept. Both of these goodness-of-fit statistics assume that the model y = constant + error is a special case of the model you're fitting, and if there's no intercept, it isn't.

Thanks for your reply, Peter. When multicollinearity is to be detected and removed, one usually begins with a unit length model (regressors centered and scaled to unit length), which contains no constant term. [At least that is what we have been taught.] Does MATLAB include a command for standardizing the regression model?
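
For reference, unit length scaling can be sketched by hand in base MATLAB as below. The data is simulated and the names Xc, colLen, W and R are assumptions for this example. The Statistics Toolbox function zscore performs the related standardization by the sample standard deviation, which differs from unit length scaling by a factor of sqrt(n-1).

```matlab
% Simulated predictors for illustration only, with two nearly
% collinear columns.
n = 40;
X = randn(n, 9);
X(:, 9) = X(:, 1) + 0.05*randn(n, 1);

% Center each column, then scale it to unit Euclidean length.
Xc = bsxfun(@minus, X, mean(X, 1));
colLen = sqrt(sum(Xc.^2, 1));
W = bsxfun(@rdivide, Xc, colLen);

% For unit length columns, W'*W is the correlation matrix of the
% predictors; off-diagonal entries near +/-1 flag collinear pairs.
R = W' * W;
```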

Also, if the design matrix input to REGSTATS is of the form [x11, x12, ...; x21, x22, ...], how does REGSTATS know whether or not a constant term exists? You say that the coefficients and their CIs are estimated correctly even when the model does not include an intercept, but the models are entirely different in the two cases. How do I know that beta(1) is the regression coefficient for x1 and not an intercept?
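
One way to look at this question is to inspect the design matrix that each model argument produces. The sketch below uses x2fx from the Statistics Toolbox, on the assumption that REGSTATS expands its model argument the same way x2fx does. The data is simulated and the names Dlinear and Dnoconst are made up for the example.

```matlab
% Simulated predictors for illustration only.
X = randn(20, 9);

% 'linear' model: a constant column followed by the 9 linear terms.
Dlinear = x2fx(X, 'linear');

% Terms matrix eye(9): the 9 linear terms only, no constant column.
Dnoconst = x2fx(X, eye(9));

size(Dlinear)     % one more column than X
size(Dnoconst)    % same number of columns as X
```

Under that assumption, beta(1) is the intercept when the model is 'linear' and the coefficient of x1 when the model is eye(9); the model argument, not the data matrix, decides which.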