From: Alexandre Ferrieux on
On May 4, 6:19 pm, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
wrote:
> On May 4, 5:09 pm, Uwe Klein <uwe_klein_habertw...(a)t-online.de> wrote:
>
> > Uwe Klein wrote:
> > > gettimeofday({1272981751, 276719}, NULL) = 0
> > > ...
> > > ~ about _330 similar lines at intervals of ~270us_ snipped.
> > > ...
> > > gettimeofday({1272981751, 368676}, NULL) = 0
>
> > expr ( 368676 - 276719 ) / 1000.0 us/ms
> > 91.957ms
> > looks like the busyloop runs for the complete time
> > between two messages.
>
> > uwe
>
> Please can you redo the strace with -tt ?
>
> -Alex

Additionally, you can attach to the running Expect process with gdb,
set a breakpoint on gettimeofday, and 'cont'. Then the 'bt' output
should be interesting.

-Alex
From: Uwe Klein on
Alexandre Ferrieux wrote:
> On May 4, 5:09 pm, Uwe Klein <uwe_klein_habertw...(a)t-online.de> wrote:
>
>>Uwe Klein wrote:
>>
>>>gettimeofday({1272981751, 276719}, NULL) = 0
>>>...
>>>~ about _330 similar lines at intervals of ~270us_ snipped.
>>>...
>>>gettimeofday({1272981751, 368676}, NULL) = 0
>>
>>expr ( 368676 - 276719 ) / 1000.0 us/ms
>>91.957ms
>>looks like the busyloop runs for the complete time
>>between two messages.
>>
>>uwe
>
>
> Please can you redo the strace with -tt ?
>
> -Alex
under way.
is your email addy receptive?
I could send a more coherent chunk.


uwe
From: Uwe Klein on
Alexandre Ferrieux wrote:
> On May 4, 6:19 pm, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> wrote:
>
>>On May 4, 5:09 pm, Uwe Klein <uwe_klein_habertw...(a)t-online.de> wrote:
>>
>>
>>>Uwe Klein wrote:
>>>
>>>>gettimeofday({1272981751, 276719}, NULL) = 0
>>>>...
>>>>~ about _330 similar lines at intervals of ~270us_ snipped.
>>>>...
>>>>gettimeofday({1272981751, 368676}, NULL) = 0
>>
>>>expr ( 368676 - 276719 ) / 1000.0 us/ms
>>>91.957ms
>>>looks like the busyloop runs for the complete time
>>>between two messages.
>>
>>>uwe
>>
>>Please can you redo the strace with -tt ?
>>
>>-Alex
>
>
> Additionally, you can attach to the running Expect process with gdb,
> set a breakpoint on gettimeofday, and 'cont'. Then the 'bt' output
> should be interesting.

run as gdb /usr/bin/wish $pid
( expect is per package reuire ... )

(gdb) break gettimeofday
Breakpoint 1 at 0xb74d95e0
(gdb) cont
Continuing.
Breakpoint 1, 0xb74d95e0 in gettimeofday () from /lib/libc.so.6
(gdb) bt
#0 0xb74d95e0 in gettimeofday () from /lib/libc.so.6
#1 0xb76a5dc4 in Tcl_GetTime () from /usr/lib/libtcl8.5.so
#2 0xb75f87df in ?? () from /usr/lib/libtcl8.5.so
#3 0xb75ee3a0 in ?? () from /usr/lib/libtcl8.5.so
#4 0xb75eead0 in Tcl_EvalObjv () from /usr/lib/libtcl8.5.so
#5 0xb7668ebd in ?? () from /usr/lib/libtcl8.5.so
#6 0xb75ee25e in ?? () from /usr/lib/libtcl8.5.so
#7 0xb7634f41 in ?? () from /usr/lib/libtcl8.5.so
#8 0xb767f9a7 in TclObjInterpProcCore () from /usr/lib/libtcl8.5.so
#9 0xb7681264 in TclObjInterpProc () from /usr/lib/libtcl8.5.so
#10 0xb75ee25e in ?? () from /usr/lib/libtcl8.5.so
#11 0xb7634f41 in ?? () from /usr/lib/libtcl8.5.so
#12 0xb7641c73 in ?? () from /usr/lib/libtcl8.5.so
#13 0xb75f103a in TclEvalObjEx () from /usr/lib/libtcl8.5.so
#14 0xb75f92a4 in ?? () from /usr/lib/libtcl8.5.so
#15 0xb75ee25e in ?? () from /usr/lib/libtcl8.5.so
#16 0xb7634f41 in ?? () from /usr/lib/libtcl8.5.so
#17 0xb7641c73 in ?? () from /usr/lib/libtcl8.5.so
#18 0xb75f13ca in TclEvalObjEx () from /usr/lib/libtcl8.5.so
#19 0xb75edb3b in Tcl_EvalObjEx () from /usr/lib/libtcl8.5.so
#20 0xb70f26ca in expMatchProcess ()
from /usr/lib/tcl/expect5.44.1.11/../../libexpect5.44.1.11.so
#21 0xb70f35a0 in exp_background_channelhandler ()
from /usr/lib/tcl/expect5.44.1.11/../../libexpect5.44.1.11.so
#22 0xb764ecb3 in Tcl_NotifyChannel () from /usr/lib/libtcl8.5.so
#23 0xb76a6c6e in ?? () from /usr/lib/libtcl8.5.so
#24 0xb766eb80 in Tcl_ServiceEvent () from /usr/lib/libtcl8.5.so
#25 0xb766ecfc in Tcl_DoOneEvent () from /usr/lib/libtcl8.5.so
#26 0xb77068b7 in Tk_MainLoop () from /usr/lib/libtk8.5.so
#27 0xb771574e in Tk_MainEx () from /usr/lib/libtk8.5.so
#28 0x080488da in main ()
From: Alexandre Ferrieux on
On May 4, 6:36 pm, Uwe Klein <uwe_klein_habertw...(a)t-online.de> wrote:
> Alexandre Ferrieux wrote:
> > On May 4, 5:09 pm, Uwe Klein <uwe_klein_habertw...(a)t-online.de> wrote:
>
> >>Uwe Klein wrote:
>
> >>>gettimeofday({1272981751, 276719}, NULL) = 0
> >>>...
> >>>~ about _330 similar lines at intervals of ~270us_ snipped.
> >>>...
> >>>gettimeofday({1272981751, 368676}, NULL) = 0
>
> >>expr ( 368676 - 276719 ) / 1000.0 us/ms
> >>91.957ms
> >>looks like the busyloop runs for the complete time
> >>between two messages.
>
> >>uwe
>
> > Please can you redo the strace with -tt ?
>
> > -Alex
>
> under way.
> is your email addy receptive?
> I could send a more coherent chunk.
>
> uwe

Thanks. Gathering all hints from both the gdb and strace:

- in the strace, I see ~20ms of busy, no-syscall CPU after each
chunk of ~100ms of runaway gettimeofday
- in the gdb it looks like the Tcl_GetTime doing the gettimeofday
is called by Tcl_Evalobjv

So, I'd say that either the script code does call [clock clicks] or
sthg like that, or the time limit enforcement mechanism is nervously
looking at its watch every other cycle. Can you redo the gdb bt with a
Tcl (and Expect) compiled with -g ? The two bt lines in between
Tcl_EvalObjv and Tcl_GetTime might tell some story...

Still no clue why it's different between your two setups. Could it be
affected by env variables ? SElinux ? ulimit -something ?

-Alex

From: Uwe Klein on
Alexandre Ferrieux wrote:
> Thanks. Gathering all hints from both the gdb and strace:
OK, ( and thanks for your help, had to detatch from my desk for most
of the day for a trip to Hamburg. )
>
> - in the strace, I see ~20ms of busy, no-syscall CPU after each
> chunk of ~100ms of runaway gettimeofday
> - in the gdb it looks like the Tcl_GetTime doing the gettimeofday
> is called by Tcl_Evalobjv
>
> So, I'd say that either the script code does call [clock clicks] or
> sthg like that, or the time limit enforcement mechanism is
> nervously looking at its watch every other cycle.
--^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That, Alex rings the bells of doom for me ;-?

every incoming sentence gets a "Point in Time" tag via [pit]

After every VDO and VDM message ( i.e. when new AIS data on other ships
or "self" is available )

I iterate over the list of "seen" ships and check for
age of the last message from each ship ( if older than 600 seconds
the ship is dumped as out of range )

Then collision parameters are computed for the updated pair(s) ( own <> $other ).

Then the display list for the top 20 most dangerous ones is generated.

these are the relevant procs that do the compare:

namespace eval pit {
variable now 0

proc pit {} {
variable now
set now [ clock seconds ]
clock format $now -format "%Y%m%dT%T" -gmt true
}
proc diff {pitA pitB} {
set secA [ clock scan $pitA -timezone :UTC ]
set secB [ clock scan $pitB -timezone :UTC ]
return [expr {$secB - $secA}]
}
proc age {pit} {
set now [pit]
----------------^^^^^^^^^^^^^
return [ pitdiff $pit $now ]
}
}

At any time about 80 .. 150 ships are visible.
my first step will be to fix my code.
( I don't understand why the same code on a tcl8.4 platform behaves rather differently.
remembering from motorola SysV times: reading a timer used to be just a memory
access and not necessarily a system function???? )

Can you redo the gdb bt with a
> Tcl (and Expect) compiled with -g ? The two bt lines in between
> Tcl_EvalObjv and Tcl_GetTime might tell some story...
I can test it easily on the dual core ( at home ) , the actual target
system is connected via a narrow pipe satcom link with high latency.
( It took ages to get the strace log downloaded ( uploaded? hmm sucked!)
>
> Still no clue why it's different between your two setups. Could it be
> affected by env variables ? SElinux ? ulimit -something ?
I will take some traces from the dual core and from my ancient system.
>
> -Alex
G!
uwe


VDM Voyage data messages : others
VDO Voyage data messages : Own!
AIS ~= VHF Data Transponder for ships.