From: Gabriel Genellina on
On Wed, 10 Feb 2010 13:15:22 -0300, Grant Edwards
<invalid(a)invalid.invalid> wrote:
> On 2010-02-09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar> wrote:
>> On Tue, 09 Feb 2010 13:10:56 -0300, Grant Edwards
>> <invalid(a)invalid.invalid> wrote:
>>
>>> What's the correct way to measure small periods of elapsed
>>> time? I've always used time.clock() in the past:
>>> However, on multi-processor machines that doesn't work.
>>> Sometimes I get negative values for delta. According to
>>> google, this is due to a bug in Windows that causes the value
>>> of time.clock() to be different depending on which core in a
>>> multi-core CPU you happen to be on. [insert appropriate
>>> MS-bashing here]
>>
>> I'm not sure you can blame MS for this issue; anyway, this
>> patch should fix the problem:
>> http://support.microsoft.com/?id=896256
>
> I'm curious why it wouldn't be Microsoft's fault, because
>
> A) Everything is Microsoft's fault. ;)
>
> B) If a patch to MS Windows fixes the problem, how is it not a
> problem in MS Windows?

I won't argue against A) because its truthness (?) is self-evident :)

99% of my code does not run in Python 3.x; I may fix it and it will
eventually run fine, but that doesn't mean it's *my* fault.

The original problem was with the RDTSC instruction on multicore CPUs;
different cores may yield different results because they're not
synchronized at all times.

Windows XP was launched in 2001, and the first dual core processors able
to execute Windows were AMD Opteron and Intel Pentium D, both launched
around April 2005 (and targeting the server market, not the home/desktop
market of Windows XP).
How could MS know in 2001 of a hardware issue that would happen four years
in the future?
Guido seems very jealous of his time machine and does not lend it to
anyone.

--
Gabriel Genellina

From: Grant Edwards on
On 2010-02-09, Gabriel Genellina <gagsl-py2(a)yahoo.com.ar> wrote:

> In code, using SetProcessAffinityMask and related functions:
> http://msdn.microsoft.com/en-us/library/ms686223(VS.85).aspx

That solves the problem. If I don't set the process affinity
mask, I regularly get measurements that are off by 6ms. I
presume 6ms is the skew between the two cores' performance
counter values. I don't know if that difference is stable or
how it varies, but now it doesn't matter. :)
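
A minimal sketch of the affinity approach from Python, assuming the standard ctypes module and the Win32 calls GetCurrentProcess and SetProcessAffinityMask. The helper names affinity_mask and pin_current_process are hypothetical, chosen for illustration; on anything other than Windows the function does nothing and returns False.

```python
import ctypes
import sys


def affinity_mask(cpus):
    """Build a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


def pin_current_process(cpus=(0,)):
    """Restrict the current process to the given CPUs (Windows only).

    Returns True on success and False on non-Windows platforms.
    """
    if sys.platform != "win32":
        return False  # the Win32 API used below only exists on Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    kernel32.SetProcessAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = ctypes.c_int

    handle = kernel32.GetCurrentProcess()
    if not kernel32.SetProcessAffinityMask(handle, affinity_mask(cpus)):
        raise ctypes.WinError(ctypes.get_last_error())
    return True
```

Calling pin_current_process((0,)) before the first timestamp keeps both readings on the same core, at the cost of giving up the other cores for the whole process.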

--
Grant Edwards                   grante at visi.com
Yow! If our behavior is strict, we do not need fun!
From: Tim Roberts on
"Gabriel Genellina" <gagsl-py2(a)yahoo.com.ar> wrote:
>
>The original problem was with the RDTSC instruction on multicore CPUs;
>different cores may yield different results because they're not
>synchronized at all times.

Not true. The synchronization issue has two causes: initial
synchronization at boot time, and power management making microscopic
adjustments to the clock rate. In Windows NT 4, Microsoft took extra pains
to adjust the cycle counters on multiprocessor computers during boot so
that the processors started out very close together. Once set, they
naturally remained in lock step, until aggressive power management became
more prevalent. In XP, they stopped trying to align at boot time.
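
The effect of unsynchronized counters can be illustrated with a toy model. This is only a simulation under assumed numbers (a fixed offset of 6000 ticks between two cores), not a reading of any real hardware counter; the class and function names are hypothetical.

```python
class SkewedCounters:
    """Toy model: each core has its own tick counter with a fixed offset."""

    def __init__(self, offsets_ticks):
        self.offsets = list(offsets_ticks)
        self.true_ticks = 0  # ticks elapsed on an imaginary shared clock

    def advance(self, ticks):
        self.true_ticks += ticks

    def read(self, core):
        # What a counter read would return when executed on `core`.
        return self.true_ticks + self.offsets[core]


def measure(counters, start_core, end_core, work_ticks):
    """Take a reading, do `work_ticks` of work, take a second reading."""
    start = counters.read(start_core)
    counters.advance(work_ticks)
    end = counters.read(end_core)
    return end - start


counters = SkewedCounters([0, -6000])  # assumed: core 1 lags core 0
print(measure(counters, 0, 0, 1000))   # same core: 1000
print(measure(counters, 0, 1, 1000))   # moved to the lagging core: -5000
print(measure(counters, 1, 0, 1000))   # moved to the leading core: 7000
```

A thread that is rescheduled between the two readings sees the offset added to or subtracted from the true interval, which is how a negative delta can appear.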

>Windows XP was launched in 2001, and the first dual core processors able
>to execute Windows were AMD Opteron and Intel Pentium D, both launched
>around April 2005 (and targeting the server market, not the home/desktop
>market of Windows XP).
>How could MS know in 2001 of a hardware issue that would happen four years
>in the future?

No, you're underestimating the problem. The issue occurs just as much in
machines with multiple processor chips, which were supported clear back in
the original NT 3.1, 1993.
--
Tim Roberts, timr(a)probo.com
Providenza & Boekelheide, Inc.
From: Albert van der Horst on
In article <87404349-5d3a-4396-aeff-60edc14a506a(a)f8g2000yqn.googlegroups.com>,
Paul McGuire <ptmcg(a)austin.rr.com> wrote:
>On Feb 10, 2:24 am, Dennis Lee Bieber <wlfr...(a)ix.netcom.com> wrote:
>> On Tue, 9 Feb 2010 21:45:38 +0000 (UTC), Grant Edwards
>> <inva...(a)invalid.invalid> declaimed the following in
>> gmane.comp.python.general:
>>
>> > Doesn't work. datetime.datetime.now has granularity of
>> > 15-16ms.
>>
>> > Intervals much less than that often come back with a delta of
>> > 0. A delay of 20ms produces a delta of either 15-16ms or
>> > 31-32ms
>>
>>         WinXP uses an ~15ms time quantum for task switching. Which defines
>> the step rate of the wall clock output...
>>
>> http://www.eggheadcafe.com/software/aspnet/35546579/the-quantum-was-n...
>> http://www.eggheadcafe.com/software/aspnet/32823760/how-do-you-set-ti...
>>
>> http://www.lochan.org/2005/keith-cl/useful/win32time.html
>> --
>>         Wulfraed         Dennis Lee Bieber         KD6MOG
>>         wlfr...(a)ix.netcom.com     HTTP://wlfraed.home.netcom.com/
>
>Gabriel Genellina reports that time.clock() uses Windows'
>QueryPerformanceCounter() API, which has much higher resolution than
>the task switcher's 15ms. QueryPerformanceCounter's resolution is
>hardware-dependent; using the Win API, and a little test program, I
>get this value on my machine:
>Frequency is 3579545 ticks/sec
>Resolution is 0.279365114840015 microsecond/tick
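
The resolution figure is simply the reciprocal of the reported frequency. A minimal sketch of that arithmetic, using the frequency quoted above; the function names are hypothetical and the tick values passed in would come from whatever counter is being read.

```python
def tick_resolution_us(frequency_hz):
    """Length of one counter tick in microseconds."""
    return 1_000_000 / frequency_hz


def ticks_to_seconds(start_ticks, end_ticks, frequency_hz):
    """Convert the difference between two counter readings to seconds."""
    return (end_ticks - start_ticks) / frequency_hz


REPORTED_FREQUENCY = 3579545  # ticks/sec, the value quoted above

# About 0.279 microseconds per tick, matching the quoted resolution.
print(tick_resolution_us(REPORTED_FREQUENCY))

# One full second's worth of ticks converts back to 1.0 second.
print(ticks_to_seconds(0, REPORTED_FREQUENCY, REPORTED_FREQUENCY))
```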

In Forth we add a small machine code routine that executes the
RDTSC instruction. (I used that to play music on a couple of
mechanical instruments in real time.)
It just counts the (3 GHz) clock cycles in a 64-bit timer.
Subtract two samples and you're done.

Is there a mechanism in Python to do something similar,
embedded assembler or something?

(This is not a general solution, but at least it would work on
Windows, that is x86 only.)

>
>-- Paul


--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert(a)spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

From: Grant Edwards on
On 2010-02-22, Albert van der Horst <albert(a)spenarnc.xs4all.nl> wrote:
> In article <87404349-5d3a-4396-aeff-60edc14a506a(a)f8g2000yqn.googlegroups.com>,

>>Gabriel Genellina reports that time.clock() uses Windows'
>>QueryPerformanceCounter() API, which has much higher resolution
>>than the task switcher's 15ms. QueryPerformanceCounter's
>>resolution is hardware-dependent; using the Win API, and a
>>little test program, I get this value on my machine:
>>Frequency is 3579545 ticks/sec
>>Resolution is 0.279365114840015 microsecond/tick
>
> In Forth we add a small machine code routine that executes the
> RDTSC instruction. (I used that to play music on a couple of
> mechanical instruments in real time.) It just counts the (3
> GHz) clock cycles in a 64-bit timer.

That's what time.clock() does, except that it converts it into
a floating point value in seconds.

> Subtract two samples and you're done.

Nope. It would fail the same way that time.clock() does on a
multi-core Windows machine.

> Is there a mechanism in Python to do something similar,
> embedded assembler or something?

You'd get the same results as using time.clock(). Just
different format/units.

> (This is not a general solution, but at least it would work on
> Windows, that is x86 only.)

It fails on Windows for the same reason that time.clock()
fails: the counters read by the RDTSC instruction are not
synchronized between the different cores.
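
For readers on later Python versions, a hedged sketch of the same "subtract two samples" idea without embedded assembler. It assumes Python 3.3 or newer, where time.perf_counter() exists (it postdates this thread); which hardware counter it reads, and whether that counter is free of the per-core skew discussed here, depends on the OS and the machine. The helper name time_call is hypothetical.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = time_call(sum, range(100000))
print("sum=%d took %.6f s" % (result, elapsed))
```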

--
Grant Edwards                   grante at visi.com
Yow! I'm a nuclear submarine under the polar ice cap and I need
a Kleenex!