From: Helmut Giese on 15 May 2010 04:10

Hello out there,

about a week ago there was a thread
  "clock microseconds with resolution in milliseconds"
in which it was observed that 'clock microseconds' only returned millisecond resolution (under Windows). The consensus at the end was that Tcl can only get what the underlying OS offers.

To me this was worrying, because I had used the high resolution timing in the past, and who knows? The need may arise again. But only now did I have time to investigate (nothing beats a holiday with bad weather to engage in this kind of activity), and it appears to be broken.

Here's the issue:
a) Tcl asks Windows if a performance counter exists.
b) If it exists but its frequency is > 15 MHz, Tcl performs additional checks, and if these fail (and they apparently do on newer machines) Tcl decides not to use the counter.

I filed a bug at SF and invite anybody with some knowledge of the performance counter to add their comments (bug id 3002022) - it may help the maintainers to resolve the bug.

For those who need this feature I can offer a workaround. The following test script will show you
- the performance counter's current value and
- what 'clock microseconds' does: is it stuck at every millisecond or does it "move".

---
package require Ffidl

# Define two Ffidl functions
ffidl::callout queryFrequ {pointer-var} int \
    [ffidl::symbol kernel32.dll QueryPerformanceFrequency]
ffidl::callout queryCount {pointer-var} int \
    [ffidl::symbol kernel32.dll QueryPerformanceCounter]

#
# getPerfCnt    Get the value of the performance counter
#
proc getPerfCnt {} {
    # 8-byte buffer that receives the 64 bit counter value
    set i64 [binary format w 0]
    queryCount i64
    binary scan $i64 w cnt
    return $cnt
}

#
# getTimeVals   Collect N counter values
#
proc getTimeVals {n} {
    # start with an empty list so that n == 0 does not raise an error
    set res {}
    for {set i 0} {$i < $n} {incr i} {
        lappend res [getPerfCnt]
    }
    return $res
}

#
# getTimeVals2  Relate the perf counter to Tcl's clock
#
# In a loop collect [getPerfCnt], [clock clicks] and [clock microseconds].
#
# Return list of N triples.
#
proc getTimeVals2 {n} {
    set res {}
    for {set i 0} {$i < $n} {incr i} {
        lappend res [list [getPerfCnt] [clock clicks]\
            [clock microseconds]]
    }
    return $res
}

# Create a binary string suitable for a 'large integer'
set i64 [binary format w 0]
set res [queryFrequ i64]
binary scan $i64 w f
puts "Frequency: $f"
puts ""

# call the test procs once to have them byte-compiled
getTimeVals 2
getTimeVals2 2

# Run the tests and show results
set N 5
set cntLst [getTimeVals $N]
puts "raw 64 bit counter"
puts "------------------"
for {set i 0} {$i < $N} {incr i} {
    puts "[format %16lu [lindex $cntLst $i]]"
}

# Test 2
set res [getTimeVals2 $N]
puts ""
puts "raw 64 bit counter   clock clicks   clock microseconds"
puts "-------------------------------------------------------"
for {set i 0} {$i < $N} {incr i} {
    puts "[format %16lu [lindex $res $i 0]]\
        [format %16lu [lindex $res $i 1]]\
        [format %20lu [lindex $res $i 2]]"
}
---

'getPerfCnt' returns the counter's raw value. If you need absolute times or delays, you have to take the counter's frequency into account.

It's not as nice as the original function (especially regarding the time it takes just to get the value), but then - it's just a workaround.

Best regards and have a nice weekend
Helmut Giese
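
A note on the last point: turning raw counter values into a delay only needs the frequency that the script prints as "Frequency". The following is a minimal sketch, not part of the original post. The proc name ticksToMicroseconds is hypothetical, and the numbers in the example call are made up for illustration.

```tcl
# Hypothetical helper (not part of the original script): convert the
# difference between two raw counter readings into microseconds.
#   start, end   raw values as returned by getPerfCnt
#   freq         counter frequency in ticks per second
proc ticksToMicroseconds {start end freq} {
    # Multiply before dividing so that integer division does not
    # discard the sub-second part of the result.
    return [expr {($end - $start) * 1000000 / $freq}]
}

# Made-up numbers: a 10 MHz counter that advanced by 25 ticks
# corresponds to 2.5 microseconds, truncated to 2 by integer division.
puts [ticksToMicroseconds 1000 1025 10000000]
```

With real readings one would call getPerfCnt before and after the code being timed and pass both values, together with the frequency, to this helper.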
From: Helmut Giese on 16 May 2010 15:10

I would like to correct the subject: It's not broken, it's by 'design out of necessity'.

Background (for those interested in technical details): Each core has a fast running 'performance counter' (aka 'high resolution timer'), which is the basis of every attempt to get sub-millisecond resolution.

The problem is that on multicore machines those timers can (and apparently will) get out of sync. Since one cannot rely on always executing on the same core, reading the counter introduces an element of randomness: a value you get may (appear to) lie in the past - or in a distant future.

It is for this reason that Tcl deliberately gives up using this timer if a 'safe environment' cannot be determined. So far nobody seems to have found a satisfying solution.

A word of WARNING: If anybody wants to use the "workaround" I posted, please be aware that it is subject to the same problem: successive calls need not be executed on the same core - hence may report values from different timers - hence may produce surprising results.

Sigh, sorry for the bad news.
Helmut Giese
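
One way to look for the effect described in this post is to scan a list of successive counter readings for steps that go backwards. The following is a minimal sketch under stated assumptions: countBackwardSteps is a hypothetical name, the input is assumed to be a list of raw readings such as the one returned by getTimeVals in the first post, and the sample list in the example call is made up.

```tcl
# Hypothetical helper: count how often a reading is smaller than the
# one taken just before it. On a single, well-behaved counter this
# should never happen.
proc countBackwardSteps {samples} {
    set backward 0
    set prev ""
    foreach cur $samples {
        # Skip the comparison for the very first sample
        if {$prev ne "" && $cur < $prev} {
            incr backward
        }
        set prev $cur
    }
    return $backward
}

# Made-up readings: 105 -> 103 and 110 -> 109 both go backwards
puts [countBackwardSteps {100 105 103 110 109}]
```

A count of zero over a short run does not show that the per-core counters are in sync; it only means that no backward jump was observed in that particular sample.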
From: MartinLemburg on 25 May 2010 03:19

Hi Helmut,

perhaps the solution we introduced in a non-Tcl application for our timing measurements is acceptable for Tcl applications, too.

When we want our application to do profiling, the application binds itself to one CPU core via ...

    SetProcessAffinityMask(GetCurrentProcess(), 1)

Perhaps the Tcl core could switch to using a single CPU core the first time "clock microseconds" is used, to ensure that the returned times are correct.

Best regards,

Martin
From: Donal K. Fellows on 25 May 2010 08:44

On 25 May, 08:19, "MartinLemburg(a)Siemens-PLM" <martin.lemburg.siemens-...(a)gmx.net> wrote:
> perhaps the solution we introduced in a non-tcl application for our
> timing measurements is acceptable for tcl applications, too.
>
> In the case, that we want our application to do profiling, the
> application binds itself to one CPU core via ...
>
>     SetProcessAffinityMask(GetCurrentProcess(), 1)
>
> Perhaps, the tcl core could be able to switch to one CPU core usage,
> if "clock microseconds" is used the first time, to assure the
> correctness of the returned times.

We're not about to do that; it would utterly ruin performance on multiprocessor systems. Better to have somewhat less accurate timing.

(You could make your own tclsh that did this though, just setting the affinity and then calling Tcl_Main...)

Donal.
From: Helmut Giese on 25 May 2010 11:28

Hi Martin,

>perhaps the solution we introduced in a non-tcl application for our
>timing measurements is acceptable for tcl applications, too.
>
>In the case, that we want our application to do profiling, the
>application binds itself to one CPU core via ...
>
>    SetProcessAffinityMask(GetCurrentProcess(), 1)
>
>Perhaps, the tcl core could be able to switch to one CPU core usage,
>if "clock microseconds" is used the first time, to assure the
>correctness of the returned times.

see Donal's reply. This question still occupies me, but since nobody has found a satisfactory solution so far, it is rather unlikely that I will be able to come up with one.

This is really annoying: We got technological advancements in the form of
- high resolution timers and
- multi-core machines
but cannot use them together.

Best regards
Helmut Giese