From: jgodfrey on
I have a fairly simple binary reader proc that exhibits massive speed
differences between Tcl 8.5.8 and Tcl 8.6b1.1.

Here's the proc:

proc readFormatted {filename} {
     set fd [ open $filename r ]
     fconfigure $fd -encoding binary -translation binary

     binary scan [ read $fd 2 ] cc type nextlen

     if {$type != 75} {
        return -code error "File is not in the expected format"
     }

     while {![append buffer [read $fd $nextlen] ; eof $fd]} {
        binary scan [read $fd 2] cc lastlen nextlen
        # convert to unsigned value
        set nextlen [expr {$nextlen & 0xff}]
        # 130 is the EOF marker
        if {$nextlen == 130} break
        # 129 is a special continuation marker (actually only 128 bytes)
        if {$nextlen == 129} {set nextlen 128}
     }
     close $fd

     return $buffer
}

Now, the timings on a particularly large file (13.3 MB).

Tcl 8.5.8 - 0.550 secs
Tcl 8.6b1.1 - 14.28 secs

So, 8.6b1 is approx 25x slower than 8.5.8. I know 8.6 is still beta,
but is the above expected at this stage?
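
For anyone who wants to compare the two versions, here is a minimal sketch of one way to take such a timing with the built-in [time] command. The helper name timeReader is made up for illustration and is not how the numbers above were necessarily measured.

```tcl
# Hypothetical helper (not part of the original post): run a reader proc
# once on a file and return the elapsed wall-clock time in seconds.
proc timeReader {cmd filename} {
    # [time] returns a string like "550000 microseconds per iteration";
    # the first list element is the number of microseconds.
    set usec [lindex [time {$cmd $filename} 1] 0]
    return [expr {$usec / 1e6}]
}

# Example use, assuming readFormatted is defined and the file exists:
#   puts [format "%.3f secs" [timeReader readFormatted /path/to/file]]
```

Running the same script under both interpreters on the same file keeps the comparison fair, since file caching can otherwise skew a single run.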

Thanks,

Jeff
From: jgodfrey on
On Apr 26, 5:09 pm, jgodfrey <jeffgodfre...(a)gmail.com> wrote:
> Now, the timings on a particularly large file (13.3 MB).
>
> Tcl 8.5.8 - 0.550 secs
> Tcl 8.6b1.1 - 14.28 secs
>
> So, 8.6b1 is approx 25x slower than 8.5.8.  I know 8.6 is still beta,
> but is the above expected at this stage?

I should add the following:

- Both of the above Tcl builds are from ActiveState
- The above timings were recorded using Win7 Pro x64

Jeff
From: Alexandre Ferrieux on
On Apr 27, 12:09 am, jgodfrey <jeffgodfre...(a)gmail.com> wrote:
> I have a fairly simple binary reader proc that exhibits massive speed
> differences between Tcl 8.5.8 and Tcl 8.6b1.1.
>
> Here's the proc:
>
> proc readFormatted {filename} {
>      set fd [ open $filename r ]
>      fconfigure $fd -encoding binary -translation binary
>
>      binary scan [ read $fd 2 ] cc type nextlen
>
>      if {$type != 75} {
>         return -code error "File is not in the expected format"
>      }
>
>      while {![append buffer [read $fd $nextlen] ; eof $fd]} {
>         binary scan [read $fd 2] cc lastlen nextlen
>         # convert to unsigned value
>         set nextlen [expr {$nextlen & 0xff}]
>         # 130 is the EOF marker
>         if {$nextlen == 130} break
>         # 129 is a special continuation marker (actually only 128 bytes)
>         if {$nextlen == 129} {set nextlen 128}
>      }
>      close $fd
>
>      return $buffer
>
> }
>
> Now, the timings on a particularly large file (13.3 MB).
>
> Tcl 8.5.8 - 0.550 secs
> Tcl 8.6b1.1 - 14.28 secs
>
> So, 8.6b1 is approx 25x slower than 8.5.8.  I know 8.6 is still beta,
> but is the above expected at this stage?

Interesting. Cursory investigation shows this is not a shimmering
issue, since this usage pattern seems to be keeping byte arrays
without string rep (which are the most efficiently concatenated
values), as told by ::tcl::unsupported::representation.

Maybe it has something to do with the quadratic cost of endlessly
reallocating the growing value... though I fail to see why that would
be new to 8.6.
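
One rough way to test the quadratic hypothesis is to double the number of appended chunks and watch how the time scales: roughly 2x per doubling suggests linear cost, roughly 4x suggests quadratic. The sketch below is only an approximation of the reader loop. The proc name appendCost is made up, and the chunks come from [binary format] rather than [read], on the assumption that both yield plain byte arrays.

```tcl
# Hypothetical helper: append $n chunks of $size bytes to a fresh variable
# and return the elapsed time in microseconds.
proc appendCost {n size} {
    # $size NUL bytes, built once (assumed stand-in for data from [read])
    set chunk [binary format a$size {}]
    set start [clock microseconds]
    for {set i 0} {$i < $n} {incr i} {
        # buffer starts out unset, as in the original proc
        append buffer $chunk
    }
    set elapsed [expr {[clock microseconds] - $start}]
    # Sanity check that all the data really was accumulated
    if {[string length $buffer] != $n * $size} {
        error "unexpected buffer length"
    }
    return $elapsed
}

# Doubling the chunk count each time; compare the ratios between rows.
foreach n {10000 20000 40000} {
    puts "$n chunks of 128 bytes: [appendCost $n 128] us"
}
```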

In any case, please file a bug report.

-Alex

PS:

Also, you might add the timings with alternative [append]'s:

set buffer $buffer[set buffer {}][read ...]

(this form allows for in-place appending, and also tries to stay at
binary level. I say "tries" because it can be spoilt if the {} literal
has been stringified before. A more robust method may involve using
[read $somefd 0] instead of {}).
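
To compare the two forms side by side without the file format getting in the way, something like the following sketch could be used. The proc names growAppend and growConcat are invented for illustration, the chunk is built with [binary format] instead of [read], and the chunk count is arbitrary.

```tcl
# Form 1: plain [append]; buffer is initially unset, as in the original proc.
proc growAppend {n chunk} {
    for {set i 0} {$i < $n} {incr i} {
        append buffer $chunk
    }
    return [string length $buffer]
}

# Form 2: the concatenation form. The old value is pushed first, then
# [set buffer {}] releases the variable's reference to it, so the value
# can be unshared by the time the concatenation happens.
proc growConcat {n chunk} {
    set buffer {}
    for {set i 0} {$i < $n} {incr i} {
        set buffer $buffer[set buffer {}]$chunk
    }
    return [string length $buffer]
}

# 128 NUL bytes as a stand-in for one chunk read from the file (assumption)
set chunk [binary format a128 {}]
foreach form {growAppend growConcat} {
    puts "$form: [time {$form 20000 $chunk} 1]"
}
```

Note that growConcat starts from the {} literal, so it is subject to the same caveat as above about that literal's representation.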
From: miguel sofer on
Alexandre Ferrieux wrote:
> (this form allows for in-place appending, and also tries to stay at
> binary level. I say "tries" because it can be spoilt if the {} literal
> has been stringified before. A more robust method may involve using
> [read $somefd 0] instead of {}).

Uh? The literal {} always has a string rep: tclEmptyStringRep.

If you want a "fresh" non-literal {}, [] might be better.

OTOH, append will work in place if the variable's value is unshared (I
think).
From: Alexandre Ferrieux on
On Apr 27, 2:10 am, miguel sofer <mso...(a)users.sf.net> wrote:
> Alexandre Ferrieux wrote:
> > (this form allows for in-place appending, and also tries to stay at
> > binary level. I say "tries" because it can be spoilt if the {} literal
> > has been stringified before. A more robust method may involve using
> > [read $somefd 0] instead of {}).
>
> Uh? The literal {} always has a string rep: tclEmptyStringRep.

Yes, but by "stringified" I mean the String obj type, which defeats
shortcuts in the concatenation code about "being already a byte
array". But re-testing, I see that the problem lies only with
[append], not with direct CONCAT1:

# $z is [open /dev/zero r], binary mode
set x {};append x [read $z 10];::tcl::unsupported::representation $x

->

value is a string with a refcount of 2, object pointer at
0x89805f0, internal representation 0x8985258:0x897fe70, string
representation "..."

But if the variable starts as unset, things are different:

unset x;append x [read $z 10];::tcl::unsupported::representation $x

->

value is a bytearray with a refcount of 2, object pointer at
0x892e518, internal representation 0x8989680:0x892f5c8, no string
representation.
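
Since /dev/zero is not available on Windows, here is a sketch of the same two trials using [binary format] as the source of bytes. This assumes such values behave like those returned by [read] on a binary channel, which may not hold exactly. The helper showRep is made up, and it only prints the representation where the (unsupported, 8.6-only) command exists.

```tcl
# Hypothetical helper: print the internal representation of a value when
# ::tcl::unsupported::representation is available (Tcl 8.6), else say so.
proc showRep {label value} {
    if {[llength [info commands ::tcl::unsupported::representation]]} {
        puts "$label: [::tcl::unsupported::representation $value]"
    } else {
        puts "$label: representation command not available in this Tcl"
    }
}

# Ten NUL bytes, standing in for [read $z 10] (assumption)
set chunk [binary format a10 {}]

# Trial 1: variable initialised with the {} literal before [append]
set x {}
append x $chunk
showRep "initialised with {}" $x

# Trial 2: variable unset before [append]
unset x
append x $chunk
showRep "initially unset" $x
```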


> If you want a "fresh" non-literal {}, [] might be better.

It turns out [] has the same effect as {} in the trials above :/

>
> OTOH, append will work in place if the variable's value is unshared (I
> think).

Yes, and same for CONCAT1 with the K-free K $x[set x {}].

-Alex