From: porky_pig_jr on
On Apr 5, 2:03 pm, Ludovicus <luir...(a)yahoo.com> wrote:
> On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
>
> > Hi,
>
> > I am measuring the time spent by an algorithm. Let's assume it is a
> > gaussian-distributed random variable. How many repetitions do I have to
> > make to get a good estimate of the mean and standard deviation of this
> > distribution?
>
> > I'd like a cook-book answer this time, because I am in a hurry with
> > these measurements. I know it's 101; probability is one of my weaker
> > sides.
>
> > --http://kaba.hilvi.org
>
> How can the time spent by an algorithm be a random variable?
> If it is run on the same computer, the time is a constant.
> If it is run on different computers, the values are not of the same
> genus.

In a modern multitasking operating system, chances are you *won't* get
the same results every time, even if no other user tasks are running.
You have to ensure that the OS housekeeping tasks (whatever they are)
are not running. You can kill some of them, but not all of them. So
YMMV. Only on a real-time OS can you get really tight bounds. So under
regular circumstances (say, you run your algorithm on Windows, OS X,
or Linux), I would expect some variability. In fact, I recall doing
something like that, and indeed, I never got the same result twice.

But I'm not sure the normal distribution is applicable in this case.
Come to think of it: how does the normal distribution arise in "real
life"? Say, why is height considered to be normally distributed? Well,
many factors, all roughly equally weighted, work together to
contribute to height, so one can say it is indirectly a result of the
CLT. But the situation is quite different when we run an algorithm,
having ensured that no other user tasks are active at the same time:
the only variability we can expect comes from the OS, and that
interference can only ever add time, never subtract it. The best
result we could get is when the algorithm runs on some RT OS; say 10
ms. On a regular OS we will never get anything faster than that, and
we will probably get something slower. Methinks running the same
algorithm as many times as possible and taking the best time as the
estimator of the "true running time" is the best you can do.

Well, for the heck of it, you can also plot the running times, to see
if you get anything like a bell curve. Which I doubt you will.
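For what it's worth, here is a minimal Python sketch of the
repeat-and-take-the-minimum idea, with a stand-in workload (a sum of
squares) where the real algorithm would go:

```python
import statistics
import time

def algorithm(n=10_000):
    # Stand-in workload; replace with the algorithm under test.
    return sum(i * i for i in range(n))

def time_once(fn):
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

# Repeat many times; the minimum is the least-contaminated observation,
# since OS interference can only add time, never subtract it.
samples = [time_once(algorithm) for _ in range(200)]
best = min(samples)
print(f"best:  {best * 1e3:.3f} ms")
print(f"mean:  {statistics.mean(samples) * 1e3:.3f} ms")
print(f"stdev: {statistics.stdev(samples) * 1e3:.3f} ms")
```

Feeding `samples` to a histogram (e.g. matplotlib's `hist`) shows
whether anything bell-shaped appears; typically the distribution is
right-skewed, with a hard left edge near the minimum.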

PPJ
From: Dirk Van de moortel on
Kaba <none(a)here.com> wrote in message
MPG.2624580e11eba1f9898b1(a)news.cc.tut.fi
> Dirk Van de moortel wrote:
>> If you're in a hurry, then let it run and make it show the mean and standard
>> deviation, together with the current repetition count, while it's running. Stop
>> it when the mean and stddev values settle down. They are what you are
>> looking for. If you're still interested in the number of repetitions,
>> read it off as well :-)
>>
>> Dirk Vdm
>
> Thanks, I'll consider whether this fits the situation.
>
> Let's put the question this way. Currently, using a sample of 10
> realizations, I have a sample standard deviation which is about 2%
> of the sample mean. What does this allow me to say about the quality
> of the sample mean (assuming gaussianity)?

No idea.
It's 30 years since I took a course on this stuff.
Rusty :-)

Dirk Vdm
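The stop-when-it-settles idea quoted above can be sketched with
Welford's online algorithm, which updates a running mean and standard
deviation one sample at a time. The timings here are simulated with a
hypothetical 10 ms mean and 2% spread; in practice each `push` would
receive a measured running time:

```python
import math
import random

class RunningStats:
    """Welford's online algorithm: numerically stable running mean/variance."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stdev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

# Feed it simulated timings and watch the estimates settle.
random.seed(0)
stats = RunningStats()
for i in range(1, 1001):
    stats.push(random.gauss(10.0, 0.2))  # hypothetical 10 ms runs, 2% spread
    if i % 200 == 0:
        print(f"n={i:4d}  mean={stats.mean:.4f}  stdev={stats.stdev():.4f}")
```

Stopping once successive printed estimates stop moving gives both the
estimates and the repetition count at which they settled.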


From: Kaba on
Dirk Van de moortel wrote:
> No idea.
> It's 30 years since I took a course on this stuff.
> Rusty :-)
>
> Dirk Vdm

OK :) Let's wait, then, and see if someone else can comment.

--
http://kaba.hilvi.org
From: Kaba on
porky_pig_jr(a)my-deja.com wrote:
> Methinks running the same algorithm as many times as possible and
> taking the best time as the estimator of the "true running time"
> is the best you can do.

I would agree if the OS were the only source of variation. However, I
also vary the input pseudo-randomly (similar, but not identical,
inputs). I am assuming that the variation caused by the OS is
negligible.
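A minimal sketch of that setup, with a stand-in `algorithm` and a
fixed seed so the pseudo-random input sequence is reproducible across
measurement sessions:

```python
import random
import statistics
import time

def algorithm(data):
    # Stand-in for the algorithm under test.
    return sorted(data)

random.seed(42)  # fixed seed: the same input sequence on every session
timings = []
for _ in range(50):
    data = [random.random() for _ in range(10_000)]  # similar, not identical, inputs
    t0 = time.perf_counter()
    algorithm(data)
    timings.append(time.perf_counter() - t0)

mean = statistics.mean(timings)
stdev = statistics.stdev(timings)
print(f"mean {mean * 1e3:.3f} ms, stdev {stdev * 1e3:.3f} ms "
      f"({stdev / mean * 100:.1f}% of mean)")
```

Here the spread genuinely reflects the input distribution, so the mean
over inputs, not the minimum, is the quantity of interest.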

--
http://kaba.hilvi.org
From: Pubkeybreaker on
On Apr 5, 11:00 am, Kaba <n...(a)here.com> wrote:
> Hi,
>
> I am measuring the time spent by an algorithm. Let's assume it is a
> gaussian-distributed random variable. How many repetitions do I have to
> make to get a good estimate of the mean and standard deviation of this
> distribution?

You need to define what you mean by "good".
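One common reading of "good", worked through with Kaba's numbers
(n = 10 runs, s about 2% of the sample mean) and assuming gaussianity:
the standard error of the sample mean is s/sqrt(n), and a 95%
confidence interval widens that by a t-quantile with n-1 degrees of
freedom:

```python
import math

# Kaba's numbers: n = 10 runs, sample stdev s = 2% of the sample mean.
n = 10
s_rel = 0.02  # stdev as a fraction of the mean

# 95% confidence interval for the mean uses the t-distribution
# with n-1 = 9 degrees of freedom; t_{0.975,9} is about 2.262.
t = 2.262
half_width = t * s_rel / math.sqrt(n)
print(f"95% CI for the mean: +/- {half_width * 100:.2f}% of the sample mean")

# To shrink the half-width to a target E (here 0.5% of the mean),
# the usual rule of thumb is n = (z * s / E)^2 with z = 1.96:
target = 0.005
n_needed = math.ceil((1.96 * s_rel / target) ** 2)
print(f"runs needed for +/-0.5%: about {n_needed}")
```

So with 10 runs the sample mean is pinned down to within about +/-1.4%
of the true mean at 95% confidence; roughly 62 runs would be needed to
tighten that to +/-0.5%.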