From: porky_pig_jr on 5 Apr 2010 14:29

On Apr 5, 2:03 pm, Ludovicus <luir...(a)yahoo.com> wrote:
> On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
> > Hi,
> >
> > I am measuring the time spent by an algorithm. Let's assume it is a
> > gaussian-distributed random variable. How many repetitions do I have to
> > make to get a good estimate of the mean and standard deviation of this
> > distribution?
> >
> > I'd like a cook-book answer this time, because I am in a hurry with these
> > measurements. I know it's 101. Probability is one of my weaker sides.
> >
> > -- http://kaba.hilvi.org
>
> How can the time spent by an algorithm be a random variable?
> If it is run on the same computer, the time is a constant.
> If it is run on different computers, the values are not of the same
> genus.

In a modern multitasking operating system, chances are you *won't* get the same result every time, even if no other user tasks are running. You have to ensure that the OS housekeeping tasks (whatever they are) are not running. You can kill some of them, but not all of them. So YMMV. Only on a real-time OS can you get really tight bounds.

So under regular circumstances (say, running your algorithm on Windows, OS X, or Linux), I would expect some variability. In fact, I recall doing something like that, and indeed I never got the same result twice. But I'm not sure the normal distribution is applicable in this case. Come to think of it: how does the normal distribution arise in "real life"? Say, why is height considered to be normally distributed? Because many factors, all roughly equally weighted, contribute to it, so one can say it is indirectly a result of the CLT. But it is quite a different situation when we run an algorithm, have ensured that no other user tasks are active, and the only variability we can expect comes from the OS, which is not really significant.

Now, the best result we are ever going to get is the time the algorithm takes on some real-time OS, say 10 ms. On a regular OS we will never get anything faster than that; we will probably get something slower. Methinks running the algorithm as many times as possible and taking the best time as the estimator of the "true running time" is the best you can do. Well, for the heck of it, you can also plot the running times to see whether you get anything like a bell curve. Which I doubt you will.

PPJ
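The best-of-many-runs idea described in the post above can be sketched in a few lines of Python. This sketch is not from the thread: the workload function, the repetition count, and the use of time.perf_counter are illustrative assumptions.

    # Sketch of "run many times, take the best" timing.
    # `algorithm` is a hypothetical stand-in workload; numbers are illustrative.
    import statistics
    import time

    def algorithm():
        return sum(i * i for i in range(100_000))

    def time_once(fn):
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

    runs = [time_once(algorithm) for _ in range(50)]
    print("best time (proxy for the 'true' running time):", min(runs))
    print("mean:", statistics.mean(runs), "stddev:", statistics.stdev(runs))

Plotting a histogram of `runs` then shows directly whether the bell-curve assumption is even plausible for the machine and OS at hand.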
From: Dirk Van de moortel on 5 Apr 2010 14:30

Kaba <none(a)here.com> wrote in message MPG.2624580e11eba1f9898b1(a)news.cc.tut.fi:
> Dirk Van de moortel wrote:
>> If you're in a hurry, then let it run and make it show the mean and standard
>> deviation, together with the current repetition count, while it's running. Stop
>> it when the mean and stddev values settle down. They are what you are
>> looking for. If you're still interested in the number of repetitions, read it
>> off as well :-)
>>
>> Dirk Vdm
>
> Thanks, I'll consider whether this fits the situation.
>
> Let's put the question this way. Currently, using a sample of 10
> realizations, I have a sample standard deviation that is about 2%
> of the sample mean. What does this allow me to say about the quality of
> the sample mean (assuming gaussianity)?

No idea.
It's 30 years since I took a course on this stuff.
Rusty :-)

Dirk Vdm
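The quoted question (a sample of 10 runs whose sample standard deviation is about 2% of the sample mean) has a standard textbook answer: under the i.i.d. Gaussian assumption, the sample mean carries a Student-t confidence interval. The sketch below is not part of the thread; it only illustrates that calculation and assumes SciPy is available.

    # 95% confidence interval for the mean, expressed relative to the mean,
    # for n = 10 runs with sample stddev = 2% of the sample mean.
    import math
    from scipy import stats

    n = 10
    rel_std = 0.02                      # s / xbar, the figure from the thread
    t = stats.t.ppf(0.975, df=n - 1)    # two-sided 95% critical value, 9 dof
    half_width = t * rel_std / math.sqrt(n)
    print(f"mean is pinned down to about +/- {100 * half_width:.1f}%")  # ~1.4%

So with those numbers the sample mean is already known to roughly +/- 1.4% at 95% confidence; whether that counts as "good" depends on what precision the measurement is meant to support.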
From: Kaba on 5 Apr 2010 14:47

Dirk Van de moortel wrote:
> No idea.
> It's 30 years since I took a course on this stuff.
> Rusty :-)
>
> Dirk Vdm

OK :) Let's wait, then, and see whether someone else can comment.

--
http://kaba.hilvi.org
From: Kaba on 5 Apr 2010 14:55

porky_pig_jr(a)my-deja.com wrote:
> Methinks running the algorithm as many times as possible and taking
> the best time as the estimator of the "true running time" is the best
> you can do.

I would agree if the OS were the only source of variation. However, I also vary the input pseudo-randomly (similar but not identical inputs). I am assuming that the variation caused by the OS is negligible.

--
http://kaba.hilvi.org
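When the input itself is varied pseudo-randomly, the quantity being estimated is the mean running time over the input distribution, so the mean with a confidence interval, rather than the minimum, is the natural estimator. A minimal sketch of that setup, with a hypothetical input generator and workload (not taken from the thread) and a fixed seed so the experiment is repeatable:

    # Timing over pseudo-randomly varied inputs; generator and workload are
    # hypothetical placeholders, the seed makes the input stream reproducible.
    import random
    import statistics
    import time

    rng = random.Random(12345)

    def make_input(size=10_000):
        return [rng.random() for _ in range(size)]

    def algorithm(data):
        return sorted(data)  # hypothetical workload

    timings = []
    for _ in range(30):
        data = make_input()
        start = time.perf_counter()
        algorithm(data)
        timings.append(time.perf_counter() - start)

    print("mean:", statistics.mean(timings), "stddev:", statistics.stdev(timings))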
From: Pubkeybreaker on 5 Apr 2010 15:25
On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
> Hi,
>
> I am measuring the time spent by an algorithm. Let's assume it is a
> gaussian-distributed random variable. How many repetitions do I have to
> make to get a good estimate of the mean and standard deviation of this
> distribution?

You need to define what you mean by "good".
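One common way to make "good" precise is to require that the 95% confidence interval for the mean have half-width at most some fraction of the mean; the required number of repetitions then follows from a pilot estimate of the coefficient of variation. The sketch below is not from the thread: the 2% figure is taken from Kaba's earlier post, and the target precisions are illustrative assumptions.

    # Repetitions needed so the 95% CI half-width is at most rel_err * mean,
    # given a pilot estimate rel_std = s / xbar (coefficient of variation).
    import math

    def required_runs(rel_std, rel_err, z=1.96):
        return math.ceil((z * rel_std / rel_err) ** 2)

    print(required_runs(rel_std=0.02, rel_err=0.01))   # about 16 runs
    print(required_runs(rel_std=0.02, rel_err=0.005))  # about 62 runs

For samples this small the normal quantile 1.96 slightly understates the interval width; redoing the calculation with the Student-t quantile from a pilot sample, as in the earlier sketch, is the more careful version.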