From: Gulcin Tekin on
Thank you very much for your answer, Peter.
Can I ask you something more?
If I determine the R-square value of the distribution functions mentioned above, can I say that the distribution function with the smaller R-square value is the most appropriate one for my data?
best regards,
Gülçin




Peter Perkins <Peter.Perkins(a)MathRemoveThisWorks.com> wrote in message <hmrjfm$jcr$1(a)fred.mathworks.com>...
> On 3/5/2010 3:18 AM, Boğaziçi university Tekin wrote:
> > Therefore, my question is how to find the most appropriate distribution
> > function numerically, not only by looking at the graphs. In other words,
> > I want to determine the distribution function that best fits my data
> > (the best fit for my data). Which one is the best one, Exponential,
> > Weibull, Rayleigh or Normal PDF (probability density function or
> > distribution function)?
>
> While the Statistics Toolbox does have a number of functions to test goodness of fit, this is less of a MATLAB question and more of a statistics question. It might be something to ask on sci.stat.math. Unfortunately, the answer you will get from most statisticians is along the lines of, "what do you mean by "best"?" And they are right, but still, it's not an unreasonable question for you to ask.
>
> There are things such as the Kolmogorov-Smirnov test (KSTEST) that are intended to test against a specific known distribution, but that's not what you are doing. There are things like Lilliefors' test (LILLIETEST) that are intended to test against a family of distributions, but they only exist (to my knowledge) for a few distributions. There's the chi-squared test, which you might try.
>
> But in my opinion you were on the right track just by looking at plots. In particular, the DFITTOOL GUI allows you to overlay your data with any number of fitted distributions. CDF plots are generally the best to use for this. Then you can see how each fitted model captures or fails to capture your data. A simple GOF test isn't going to do that.
>
> Hope this helps.
From: dpb on
Gulcin Tekin wrote:
> Thank you very much for your answer, Peter.
> Can I ask you something more?
> If I determine the R-square value of the distribution functions mentioned
> above, can I say that the distribution function with the smaller
> R-square value is the most appropriate one for my data?
> best regards, Gülçin
....

You could, and it might be, for some definition of "appropriate".

It depends on what the critical end result is. I could foresee a better
overall Rsq but a poorer fit in the tails from which to estimate, for
example.

I think there's too little information from which to make any unequivocal
response.

I agree w/ the idea of the plotting, particularly if you do so in the
appropriate cdf-space rather than just plotting the raw data. I'd
recommend Hahn & Shapiro, _Statistical_Models_in_Engineering_ (Wiley),
as containing approachable chapters on probability plotting and tests
for distributional assumptions.
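
For what it's worth, a minimal sketch of plotting in cdf-space might look
like the following. It assumes the Statistics Toolbox is available. The
sample x is made up (simulated Weibull draws) and only stands in for the
real measurements, which were never posted; the variable names are
likewise just for illustration.

```matlab
% Hypothetical positive-valued sample; replace with the real data.
rng(0);
x = wblrnd(2, 1.5, 200, 1);

% Fit each candidate family.
muExp = expfit(x);            % exponential: mean
bRayl = raylfit(x);           % Rayleigh: scale
pWbl  = wblfit(x);            % Weibull: [scale, shape]
[muN, sigN] = normfit(x);     % normal: mean and standard deviation

% Empirical CDF as a stairstep, fitted CDFs as smooth curves on top.
[f, xe] = ecdf(x);
xs = linspace(0, max(x), 400);
figure;
stairs(xe, f, 'k'); hold on
plot(xs, expcdf(xs, muExp), ...
     xs, raylcdf(xs, bRayl), ...
     xs, wblcdf(xs, pWbl(1), pWbl(2)), ...
     xs, normcdf(xs, muN, sigN));
legend('Empirical', 'Exponential', 'Rayleigh', 'Weibull', 'Normal', ...
       'Location', 'southeast');
xlabel('x'); ylabel('Cumulative probability');
hold off

% Probability plot: the points lie close to the reference line when the
% chosen family describes the data well.
figure;
probplot('weibull', x);
```

Where a fitted curve leaves the stairstep (middle or tails) is visible
directly, which a single summary number would hide.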

--
From: Gulcin Tekin on
Thanks for your answer. So it will be best to determine which of these four distributions fits my data only visually; there is no other way to determine it numerically.
And you suggest plotting the CDFs of these four distributions (Normal, Rayleigh, Exponential and Weibull) against the real data, and then determining the best fit only by looking at the plots, is that right?
Thank you very much.
Best regards,
Gülçin


From: Peter Perkins on
On 3/5/2010 2:00 PM, Gulcin Tekin wrote:

> If I determine the R-square value of the distribution functions mentioned
> above, can I say that the distribution function with the smaller
> R-square value is the most appropriate one for my data?

R^2 is a statistic used for regression models, and has nothing at all to do with the kind of distribution fitting that I understand you to be doing.
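
As an illustration only: one numeric summary that does apply to fitted distributions is the log-likelihood, optionally penalized for the number of parameters (AIC). This is not something proposed earlier in the thread, just a sketch of what a likelihood-based comparison could look like. The sample x is simulated and hypothetical, and the Statistics Toolbox is assumed.

```matlab
% Hypothetical positive-valued sample; replace with the real data.
rng(0);
x = wblrnd(2, 1.5, 200, 1);

% Maximum likelihood estimates for each family.
muExp = expfit(x);
bRayl = raylfit(x);
pWbl  = wblfit(x);            % [scale, shape]
muN   = mean(x);
sigN  = std(x, 1);            % normalize by n, which is the ML estimate

% Negative log-likelihood of the data under each fitted model.
nll = [ -sum(log(exppdf(x, muExp))), ...
        -sum(log(raylpdf(x, bRayl))), ...
        -sum(log(wblpdf(x, pWbl(1), pWbl(2)))), ...
        -sum(log(normpdf(x, muN, sigN))) ];

k     = [1 1 2 2];            % number of fitted parameters per family
aic   = 2*k + 2*nll;          % smaller AIC = better by this criterion
names = {'Exponential', 'Rayleigh', 'Weibull', 'Normal'};

[~, order] = sort(aic);
for i = order
    fprintf('%-12s negLogLik = %8.2f   AIC = %8.2f\n', ...
            names{i}, nll(i), aic(i));
end
```

A ranking like this says which of the candidates is least bad relative to the others; it does not say that any of them fits well, so it complements the plots rather than replacing them.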
From: dpb on
Gulcin Tekin wrote:
> Thanks for your answer. So it will be best to determine which of
> these four distributions fits my data only visually; there is no
> other way to determine it numerically.

That's not exactly what I said...

I didn't say that there is no way to make some comparisons numerically;
I said there isn't enough information in your postings to provide
unambiguous answers.

> And you suggest plotting the CDFs of these four distributions (Normal,
> Rayleigh, Exponential and Weibull) against the real data, and then
> determining the best fit only by looking at the plots, is that right?
....

Not necessarily only; if a test statistic appropriate for the assumed
distribution rejected the null hypothesis for one or more sample sets of
data, one could use that to aid the decision. Of course, one would
probably find that the data didn't visually fit the postulated
distribution very well either, but the plot doesn't have to be the
_only_ criterion.
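
A rough sketch of such a test in MATLAB, assuming the Statistics Toolbox,
could use the chi-squared goodness-of-fit test with the degrees of
freedom reduced by the number of estimated parameters. The sample x is
simulated and hypothetical; it stands in for the real data.

```matlab
% Hypothetical positive-valued sample; replace with the real data.
rng(0);
x = wblrnd(2, 1.5, 200, 1);

% Chi-squared GOF tests against fitted Weibull and exponential models.
% 'nparams' tells chi2gof how many parameters were estimated from x.
pWbl  = wblfit(x);            % [scale, shape]
muExp = expfit(x);
[hW, pW] = chi2gof(x, 'cdf', {@wblcdf, pWbl(1), pWbl(2)}, 'nparams', 2);
[hE, pE] = chi2gof(x, 'cdf', {@expcdf, muExp}, 'nparams', 1);

% Lilliefors test of normality with unspecified mean and variance.
[hN, pN] = lillietest(x);

fprintf('Weibull      reject = %d   p = %.4f\n', hW, pW);
fprintf('Exponential  reject = %d   p = %.4f\n', hE, pE);
fprintf('Normal       reject = %d   p = %.4f\n', hN, pN);
```

A rejection (h = 1) is evidence against that family; a non-rejection
only means the test found no evidence against it, not that the family
is "best".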

I'd suggest doing some reading, or consulting w/ your advisor or your
university's statistics consulting group, or both. As noted above,
there's some pretty accessible information in Hahn and Shapiro that
would, I think, be enlightening.

--