From: Sumeet T on 3 Aug 2010 17:18

Hi,

I have a huge data set comprising three vectors, say X, Y, Z. The vector Z depends nonlinearly on X and Y. Each vector contains about 10K elements.

I wish to obtain the best fit for this data while at the same time trying not to make this fit 'perfect'. By perfect I mean that data points which are too far off/scattered may be ignored. I would then like to measure the scatter. Ignoring some data points helps to establish a simple fit, as opposed to the complex fit obtained by including the widely scattered points. Such a complex fit would not be of much use to me, as it becomes case-specific and may not be usable elsewhere.

I am struggling to get started on this as I have not used the Optimization Toolbox in the past. I would appreciate feedback and assistance from members of the MathWorks community.

Thanks so much.

From: TideMan on 3 Aug 2010 17:35

On Aug 4, 9:18 am, "Sumeet T" <sumeettre...(a)gmail.com> wrote:
> Hi,
>
> I have huge data set comprising of three vectors say X, Y, Z. The vector 'Z' is non linearly dependent on X,Y. Each vector contains about 10K elements.
>
> I wish to obtain the best fit for this data while at the same time trying not to make this fit 'perfect'. By perfect I mean that the data points which are too far off/scattered may be ignored. I would then like to measure the scatter. Ignoring of some data points is helpful to establish a simple fit as compared to a complex fit obtained by including the widely scattered points. Such a complex fit would not be of much use to me, as it becomes case specific and may not be used elsewhere.
>
> I am struggling to get started on this as I have not used optimization toolbox in the past. I would appreciate feedback and assistance from members of the mathwork community.
>
> Thanks so much.

First of all, 3 vectors of 10,000 elements each is not a "huge dataset". After all, there are 86,400 s in a day, so 10K elements is much less than one day's data at 1 Hz.

Before you can decide which points are outliers that need to be ignored, you need a model. I'm not sure why you want to use the optimisation toolbox in preference to mldivide, where you could fit a linear model like this:

    coef = [X Y ones(length(X),1)]\Z;

(Note: this could be extended to a nonlinear model simply by including terms like X.^2 and so on.)

Now you can figure out which points are outliers, set them to NaN, and repeat the fit on the good data.
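
The fit-reject-refit loop described above could look roughly like the sketch below. It is only an illustration, not a definitive recipe: the synthetic X, Y, Z, the 3-sigma rejection threshold k, the cap of 10 passes, and the names good and scatterStd are all assumptions that do not come from the thread. It also uses a logical mask in place of NaN, because mldivide returns NaN coefficients if NaNs are left in the data.

```matlab
% Sketch: least-squares plane fit with iterative outlier rejection.
% Synthetic data stand in for the poster's X, Y, Z (assumption).
n = 200;
X = linspace(0, 1, n)';
Y = sin(7 * X);
Z = 2*X - 3*Y + 1 + 0.01*cos(50*X);   % known plane plus small ripple
bad = 10:20:n;                        % indices of injected outliers
Z(bad) = Z(bad) + 5;

A = [X Y ones(n, 1)];                 % design matrix, as in the reply
good = true(n, 1);                    % mask of points currently kept
k = 3;                                % rejection threshold in sigmas (assumed)

for iter = 1:10                       % cap on refit passes (assumed)
    coef  = A(good, :) \ Z(good);     % fit using only the kept points
    resid = Z - A * coef;             % residuals for every point
    sigma = std(resid(good));         % scatter of the kept points
    newGood = abs(resid) <= k * sigma;
    if isequal(newGood, good)
        break;                        % mask stopped changing
    end
    good = newGood;
end

scatterStd = std(resid(good));        % measured scatter about the final fit
```

For a nonlinear model, extra columns such as X.^2, X.*Y or Y.^2 can be appended to A; the loop itself stays the same. The threshold k controls how aggressively points are dropped and would need tuning for real data.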