From: porky_pig_jr on
On Apr 5, 2:03 pm, Ludovicus <luir...(a)yahoo.com> wrote:
> On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
>
> > Hi,
>
> > I am measuring the time spent by an algorithm. Let's assume it is a
> > gaussian-distributed random variable. How many repetitions do I have to
> > make to get a good estimate of the mean and standard deviation of this
> > distribution?
>
> > I'd like a cook-book answer this time, because I am in a hurry with
> > these measurements. I know it's 101; probability is one of my weaker
> > sides.
>
> > --http://kaba.hilvi.org
>
> How can the time spent by an algorithm be a random variable?
> If it is run on the same computer, the time is a constant.
> If it is run on different computers, the values are not of the same
> genus.

In a modern multitasking operating system, chances are you *won't* get
the same results every time, even if no other user tasks are running.
You have to ensure that the OS housekeeping tasks (whatever they are)
are not running. You can kill some of them, but not all of them. So
YMMV. Only on a real-time OS can you get really tight bounds. So under
regular circumstances (say, you run your algorithm on Windows, OS X,
or Linux), I would expect some variability. In fact, I recall doing
something like that, and indeed, I never got the same result twice.

But I'm not sure the normal distribution is applicable in this case.
Come to think of it: how does the normal distribution arise in "real
life"? Say, why is height considered to be normally distributed? Well,
many factors, all roughly equally weighted, work together to
contribute to height, so one can say it is indirectly a result of the
CLT. But the situation is quite different when we run an algorithm,
having ensured that no other user tasks are active at the same time:
the only variability we can expect comes from the OS, and that
interference can only ever add time, never subtract it. The best
result we could get is when the algorithm runs on some RT OS; say 10
ms. On a regular OS we will never get anything faster than that, and
we will probably get something slower. Methinks running the same
algorithm as many times as possible and taking the best time as the
estimator of the "true running time" is the best you can do.

Well, for the heck of it, you can also plot the running times, to see
if you get anything like a bell curve. Which I doubt you will.
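For what it's worth, here is a minimal Python sketch of the
repeat-and-take-the-minimum idea, with a stand-in workload (a sum of
squares) where the real algorithm would go:

```python
import statistics
import time

def algorithm(n=10_000):
    # Stand-in workload; replace with the algorithm under test.
    return sum(i * i for i in range(n))

def time_once(fn):
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

# Repeat many times; the minimum is the least-contaminated observation,
# since OS interference can only add time, never subtract it.
samples = [time_once(algorithm) for _ in range(200)]
best = min(samples)
print(f"best:  {best * 1e3:.3f} ms")
print(f"mean:  {statistics.mean(samples) * 1e3:.3f} ms")
print(f"stdev: {statistics.stdev(samples) * 1e3:.3f} ms")
```

Feeding `samples` to a histogram (e.g. matplotlib's `hist`) shows
whether anything bell-shaped appears; typically the distribution is
right-skewed, with a hard left edge near the minimum.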

PPJ
From: Dirk Van de moortel on
Kaba <none(a)here.com> wrote in message
MPG.2624580e11eba1f9898b1(a)news.cc.tut.fi
> Dirk Van de moortel wrote:
>> If you're in a hurry, then let it run and make it show the mean and standard
>> deviation, together with the current repetition count, while it's running. Stop
>> it when the mean and stddev values settle down. They are what you are
>> looking for. If you're still interested in the number of repetitions,
>> read it off as well :-)
>>
>> Dirk Vdm
>
> Thanks, I'll consider whether this fits the situation.
>
> Let's put the question this way. Currently, using a sample of 10
> realizations, I have a sample standard deviation which is about 2%
> of the sample mean. What does this allow me to say about the quality
> of the sample mean (assuming gaussianity)?

No idea.
It's 30 years since I took a course on this stuff.
Rusty :-)

Dirk Vdm
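The stop-when-it-settles idea quoted above can be sketched with
Welford's online algorithm, which updates a running mean and standard
deviation one sample at a time. The timings here are simulated with a
hypothetical 10 ms mean and 2% spread; in practice each `push` would
receive a measured running time:

```python
import math
import random

class RunningStats:
    """Welford's online algorithm: numerically stable running mean/variance."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

# Feed it simulated timings and watch the estimates settle.
random.seed(0)
stats = RunningStats()
for i in range(1, 1001):
    stats.push(random.gauss(10.0, 0.2))  # hypothetical 10 ms runs, 2% spread
    if i % 200 == 0:
        print(f"n={i:4d}  mean={stats.mean:.4f}  stdev={stats.stdev():.4f}")
```

Stopping once successive printed estimates stop moving gives both the
estimates and the repetition count at which they settled.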


From: Kaba on
Dirk Van de moortel wrote:
> No idea.
> It's 30 years since I took a course on this stuff.
> Rusty :-)
>
> Dirk Vdm

OK :) Let's wait, then, and see if someone else can comment.

--
http://kaba.hilvi.org
From: Kaba on
porky_pig_jr(a)my-deja.com wrote:
> Methinks running the same algorithm as many times as possible and
> taking the best time as the estimator of the "true running time"
> is the best you can do.

I would agree if the OS were the only source of variation. However, I
also vary the input pseudo-randomly (similar, but not identical,
inputs). I am assuming that the variation caused by the OS is
negligible.
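A minimal sketch of that setup, with a stand-in `algorithm` and a
fixed seed so the pseudo-random input sequence is reproducible across
measurement sessions:

```python
import random
import statistics
import time

def algorithm(data):
    # Stand-in for the algorithm under test.
    return sorted(data)

random.seed(42)  # fixed seed: the same input sequence on every session
timings = []
for _ in range(50):
    data = [random.random() for _ in range(10_000)]  # similar, not identical, inputs
    t0 = time.perf_counter()
    algorithm(data)
    timings.append(time.perf_counter() - t0)

mean = statistics.mean(timings)
stdev = statistics.stdev(timings)
print(f"mean {mean * 1e3:.3f} ms, stdev {stdev * 1e3:.3f} ms "
      f"({stdev / mean * 100:.1f}% of mean)")
```

Here the spread genuinely reflects the input distribution, so the mean
over inputs, not the minimum, is the quantity of interest.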

--
http://kaba.hilvi.org
From: Pubkeybreaker on
On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
> Hi,
>
> I am measuring the time spent by an algorithm. Let's assume it is a
> gaussian-distributed random variable. How many repetitions do I have to
> make to get a good estimate of the mean and standard deviation of this
> distribution?

You need to define what you mean by "good".
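One common reading of "good", worked through with Kaba's numbers
(n = 10 runs, s about 2% of the sample mean) and assuming gaussianity:
the standard error of the sample mean is s/sqrt(n), and a 95%
confidence interval widens that by a t-quantile with n-1 degrees of
freedom:

```python
import math

# Kaba's numbers: n = 10 runs, sample stdev s = 2% of the sample mean.
n = 10
s_rel = 0.02  # stdev as a fraction of the mean

# 95% confidence interval for the mean uses the t-distribution
# with n-1 = 9 degrees of freedom; t_{0.975,9} is about 2.262.
t = 2.262
half_width = t * s_rel / math.sqrt(n)
print(f"95% CI for the mean: +/- {half_width * 100:.2f}% of the sample mean")

# To shrink the half-width to a target E (here 0.5% of the mean),
# the usual rule of thumb is n = (z * s / E)^2 with z = 1.96:
target = 0.005
n_needed = math.ceil((1.96 * s_rel / target) ** 2)
print(f"runs needed for +/-0.5%: about {n_needed}")
```

So with 10 runs the sample mean is pinned down to within about +/-1.4%
of the true mean at 95% confidence; roughly 62 runs would be needed to
tighten that to +/-0.5%.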