From: Greg Heath on 29 Mar 2010 18:53

On Mar 29, 4:19 pm, rams <lrams...(a)gmail.com> wrote:
> Hi Walter Roberson,
>
> I don't have the model, but when I plot the dependent variable individually against each of the independent variables, I can fit them with a 3rd-order polynomial. Will that be helpful in finding the multiple regression equation? Thanks in advance.

There is no "THE" model.

1. You propose a model based on prior information or plain ignorance.
2. Quantify the goodness of fit, e.g., mean-square error.
3. Either accept the result, or propose another model and go to 1.

Since you know that good single-variable third-order fits are reasonable, you could try linear, quadratic and third-order models using all of the variables.

However, the number of coefficients for each model is

Linear:    1 + 5   = 6
Quadratic: 6 + 5^2 = 31
Cubic:    31 + 5^3 = 156

As a rule of thumb, you would like at least 10 times as many data points as coefficients to estimate. Therefore, if the quadratic fit (see my thread "Vectorization for Quadratic Polynomial Regression") is not satisfactory, you might consider a neural network.

Hope this helps.

Greg
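[Editor's note: Greg's quadratic and cubic counts (31, 156) treat every ordered cross term (x1*x2 and x2*x1) separately. If symmetric duplicates are merged, the number of distinct monomials of total degree <= d in n variables is the binomial coefficient C(n+d, d), which is smaller. A quick check, written in Python purely for illustration (the thread itself concerns MATLAB):]

```python
from math import comb

def n_poly_coeffs(n_vars, degree):
    """Number of distinct monomials of total degree <= degree in n_vars variables."""
    return comb(n_vars + degree, degree)

for d, name in [(1, "Linear"), (2, "Quadratic"), (3, "Cubic")]:
    n = n_poly_coeffs(5, d)
    print(f"{name}: {n} distinct coefficients, ~{10 * n} data points by the 10x rule")
```

So with distinct monomials the cubic model needs 56 coefficients rather than 156, and roughly 560 data points by the same 10x rule of thumb.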
From: the cyclist on 30 Mar 2010 09:29

Greg Heath <heath(a)alumni.brown.edu> wrote in message <c5b110b4-776a-4c56-94ae-93b38e9e1b4a(a)k24g2000pro.googlegroups.com>...
> On Mar 29, 4:19 pm, rams <lrams...(a)gmail.com> wrote:
> > Hi Walter Roberson,
> >
> > I don't have the model, but when I plot the dependent variable individually against each of the independent variables, I can fit them with a 3rd-order polynomial. Will that be helpful in finding the multiple regression equation? Thanks in advance.
>
> There is no "THE" model.
> 1. You propose a model based on prior information or plain ignorance.
> 2. Quantify the goodness of fit, e.g., mean-square error.
> 3. Either accept the result, or propose another model and go to 1.
>
> Since you know that good single-variable third-order fits
> are reasonable, you could try linear, quadratic and
> third-order models using all of the variables.
>
> However, the number of coefficients for each model is
> Linear:    1 + 5   = 6
> Quadratic: 6 + 5^2 = 31
> Cubic:    31 + 5^3 = 156
>
> As a rule of thumb, you would like at least 10 times as
> many data points as coefficients to estimate. Therefore, if the
> quadratic fit (see my thread "Vectorization for Quadratic
> Polynomial Regression") is not satisfactory, you might
> consider a neural network.
>
> Hope this helps.
>
> Greg

rams,

Floating around this discussion, but not being stated outright, is that you should be aware of the "parsimony" of your model, and the dangers of overfitting. Even if you have the ~1560 data points that Greg suggests you would need to fit third-order polynomials in all combinations of your 5 independent variables, that does NOT mean that that is a good model. In general, more and more parameters lead to a better fit, but at some point you are fitting the random noise, which does you no good. (That is overfitting.)
I'm not an expert on these things, but I know there are tests that one can apply to models to assess them. Probably searching on some of the keywords in this thread will help you.

You might also try telling us a bit more about what you are trying to do conceptually with your data, not just the math.

the cyclist
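[Editor's note: the overfitting the cyclist describes is easy to demonstrate with a held-out validation set. In this illustrative Python sketch (all data synthetic), the true relationship is linear; a degree-9 polynomial drives the training error toward zero while the validation error tells the real story:]

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying linear relationship.
x = np.linspace(0, 1, 20)
y = 2.0 * x + 1.0 + 0.2 * rng.standard_normal(x.size)

# Hold out every other point for validation.
x_trn, y_trn = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

def mse(degree):
    """Train/validation mean-squared error of a polynomial fit of the given degree."""
    p = np.polyfit(x_trn, y_trn, degree)
    e_trn = np.mean((np.polyval(p, x_trn) - y_trn) ** 2)
    e_val = np.mean((np.polyval(p, x_val) - y_val) ** 2)
    return e_trn, e_val

for d in (1, 3, 9):
    e_trn, e_val = mse(d)
    print(f"degree {d}: train MSE {e_trn:.4f}, validation MSE {e_val:.4f}")
```

With 10 training points, the degree-9 polynomial essentially interpolates the noise: the training error collapses while the validation error does not, which is exactly the overfitting signature.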
From: rams on 30 Mar 2010 11:15

I have reflectance data modeled using 5 independent variables. I also have measured reflectance data that I collected in the field. Now I want to fit the modeled reflectance to the measured reflectance by adjusting those 5 independent variables, so that I might estimate those 5 parameters for the measured data.
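[Editor's note: what rams describes is an inverse problem: adjust the 5 model parameters until the modeled spectrum matches the measured one. If, purely for illustration, the reflectance model were linear in its parameters, ordinary least squares would recover them; for a genuinely nonlinear model one would instead hand the residual function to a nonlinear solver such as MATLAB's lsqnonlin or lsqcurvefit. A minimal Python sketch of the linear case, with every name and number hypothetical:]

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 40 wavelengths, reflectance assumed linear in 5 parameters.
# A (40 x 5) maps parameters to modeled reflectance; in practice this matrix
# would come from the actual reflectance model, not random numbers.
A = rng.standard_normal((40, 5))
true_params = np.array([0.3, 1.2, -0.5, 0.8, 2.0])
measured = A @ true_params + 0.01 * rng.standard_normal(40)  # noisy "field" data

# Least-squares estimate of the 5 parameters from the measured spectrum.
est, *_ = np.linalg.lstsq(A, measured, rcond=None)
print(est)
```

The recoverability of the parameters depends on the conditioning of the model: if two parameters affect the spectrum in nearly the same way, no fitting routine can separate them, which is the difficulty Greg raises in his reply below.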
From: Greg Heath on 31 Mar 2010 04:49

On Mar 30, 9:29 am, "the cyclist" <thecycl...(a)gmail.com> wrote:
> Greg Heath <he...(a)alumni.brown.edu> wrote in message <c5b110b4-776a-4c56-94ae-93b38e9e1...(a)k24g2000pro.googlegroups.com>...
> > On Mar 29, 4:19 pm, rams <lrams...(a)gmail.com> wrote:
> > > Hi Walter Roberson,
> > >
> > > I don't have the model, but when I plot the dependent variable individually against each of the independent variables, I can fit them with a 3rd-order polynomial. Will that be helpful in finding the multiple regression equation? Thanks in advance.
> >
> > There is no "THE" model.
> > 1. You propose a model based on prior information or plain ignorance.
> > 2. Quantify the goodness of fit, e.g., mean-square error.
> > 3. Either accept the result, or propose another model and go to 1.

Correction: go to 2.

> > Since you know that good single-variable third-order fits
> > are reasonable, you could try linear, quadratic and
> > third-order models using all of the variables.
> >
> > However, the number of coefficients for each model is
> > Linear:    1 + 5   = 6
> > Quadratic: 6 + 5^2 = 31
> > Cubic:    31 + 5^3 = 156
> >
> > As a rule of thumb, you would like at least 10 times as
> > many data points as coefficients to estimate. Therefore, if the
> > quadratic fit (see my thread "Vectorization for Quadratic
> > Polynomial Regression") is not satisfactory, you might
> > consider a neural network.
>
> Floating around this discussion, but not being stated outright,
> is that you should be aware of the "parsimony" of your model,
> and the dangers of overfitting. Even if you have the ~1560 data
> points that Greg suggests you would need to fit third-order
> polynomials in all combinations of your 5 independent variables,

No, you don't have to do this. Very often you can use a stagewise search and obtain a nonoptimal model that is insignificantly worse than an optimal model.

> that does NOT mean that that is a good model.

Very true.
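[Editor's note: the stagewise search Greg mentions can take the form of greedy forward selection: start from a constant model and repeatedly add whichever candidate polynomial term most reduces the training error. A toy Python sketch under invented data (the true model contains only a linear term and one cross term, plus noise):]

```python
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(2)

# Synthetic data: 200 samples, 5 inputs; the target depends on a few terms only.
X = rng.standard_normal((200, 5))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + 0.05 * rng.standard_normal(200)

# Candidate monomials up to degree 3, each encoded as a tuple of variable indices.
candidates = [c for d in (1, 2, 3) for c in combinations_with_replacement(range(5), d)]

def column(term):
    """Design-matrix column for one monomial, e.g. (1, 2) -> X1 * X2."""
    return np.prod(X[:, list(term)], axis=1)

def sse(cols):
    """Sum of squared residuals of a least-squares fit with the given columns."""
    return np.linalg.lstsq(np.column_stack(cols), y, rcond=None)[1][0]

# Greedy forward (stagewise) selection of 3 terms beyond the constant.
selected, cols = [], [np.ones(200)]
for _ in range(3):
    best = min((t for t in candidates if t not in selected),
               key=lambda t: sse(cols + [column(t)]))
    selected.append(best)
    cols.append(column(best))

print(selected)
```

Here the first two terms selected are the true ones, (0,) and (1, 2), out of 55 candidates, without ever fitting the full 56-coefficient cubic model. Real stepwise procedures add a stopping criterion (e.g., an F-test or a validation set) instead of a fixed term count.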
Nor does it mean that a poor model will automatically result if you only have 1.5*156 = 234 data points.

If Nw is the number of weights and thresholds to be estimated and Ntrn is the number of training vectors, the influence of measurement errors and inadequate data sampling causes the estimates to be less useful as the ratio r = Ntrn/Nw decreases.

Typically, if r <= 1 (Ntrn <= Nw), zero-error solutions exist. However, those solutions will include the effects of design-data measurement errors and inadequate sampling. Therefore, the resulting weights may be useless for nondesign data.

Typically, if r > 1 (Ntrn > Nw), zero-error solutions don't exist. However, least-square-error and other approximate solutions are available. Moreover, their accuracy tends to increase as r increases, because the Ntrn - Nw extra degrees of freedom allow training algorithms to average out weight-estimate errors caused by the sparseness of random sampling and the existence of random measurement error.

Often a search is made to determine a lower bound for a good choice of r. When good results are obtained, I generally find that ~2 <= r <= ~32. Therefore, I tend to begin my search with r ~ sqrt(2*32) ~ 8 or 10.

> In general, more and more parameters lead to a better fit,
> but at some point you are fitting the random noise, which
> does you no good. (That is overfitting.)

Overfitting means using more parameters than are necessary. However, there are various ways to mitigate overfitting (e.g., regularization and stopped training). If iterative training is not stopped just past the point where a validation (nontraining) design set achieves a minimum in mean-squared error or another stopping criterion, the model is said to be overtrained.

Bottom line:

Overfitting can be mitigated.
Overtraining an overfit model should be avoided.

Hope this helps.

Greg
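[Editor's note: Greg's ratio r = Ntrn/Nw is simple to tabulate. For a single-hidden-layer feedforward net with I inputs, H hidden units and one output (this architecture is an assumption, chosen only to make the arithmetic concrete), the weight-and-threshold count is Nw = (I + 1)*H + (H + 1):]

```python
def n_weights(n_in, n_hidden, n_out=1):
    """Weights plus biases (thresholds) for a single-hidden-layer feedforward net."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

# 5 inputs, as in rams' problem; training-set sizes for Greg's starting point r = 10.
for h in (3, 5, 10):
    nw = n_weights(5, h)
    print(f"H={h}: Nw={nw}, Ntrn for r=10: {10 * nw}")
```

Even a modest 10-hidden-unit net on 5 inputs has 71 free parameters, so Greg's 234-point data set would give r of roughly 3, inside his empirical ~2 to ~32 range.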
From: Greg Heath on 31 Mar 2010 04:53

On Mar 30, 3:15 pm, rams <lrams...(a)gmail.com> wrote:
> I have reflectance data modeled using 5 independent variables. I also have measured reflectance data that I collected in the field. Now I want to fit the modeled reflectance to the measured reflectance by adjusting those 5 independent variables, so that I might estimate those 5 parameters for the measured data.

In general, this appears to be dang near impossible. How do you propose to do this?

Greg