From: Paul on 10 Jan 2010 10:09

Hello,

I have a question regarding Tcl binary strings.

If running the following script creating two binary strings:

    set num 50

    for { set j 0 } { $j < $num } { incr j } {
        append row1 [binary format c 0]
    }
    puts "row1: length=[string length $row1] bytelength=[string bytelength $row1]"

    for { set j 0 } { $j < $num } { incr j } {
        append row2 [binary format c 1]
    }
    puts "row2: length=[string length $row2] bytelength=[string bytelength $row2]"

I get the following output (tested with 8.4, 8.5, 8.6):

    row1: length=50 bytelength=100
    row2: length=50 bytelength=50

Why do zero values occupy 2 bytes in a binary string?

Regards,
Paul

From: slebetman on 10 Jan 2010 10:31

On Jan 10, 11:09 pm, "Paul(a)Tcl3D" <p...(a)tcl3d.org> wrote:
> I get the following output (tested with 8.4, 8.5, 8.6):
>
> row1: length=50 bytelength=100
> row2: length=50 bytelength=50
>
> Why do zero values occupy 2 bytes in a binary string?

Don't use [string bytelength]. What you want is [string length].

Because Tcl is implemented in C, and C strings are terminated by NUL (0x00), the Tcl interpreter internally encodes NULs as a special two-byte sequence. [string bytelength] is there mainly for debugging purposes, or to work around any edge cases not handled automatically by Tcl. For everything else use [string length].
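
The advice above can be illustrated with a minimal sketch. It builds the same two 50-byte values as the original script (using [string repeat] instead of a loop, which is a choice made here for brevity, not something from the thread) and measures them only with [string length]:

```tcl
# Sketch: measure binary data with [string length] only.
set num 50

# Build the two rows in one step each; equivalent to the append loops
# in the original post.
set row1 [string repeat [binary format c 0] $num]
set row2 [string repeat [binary format c 1] $num]

# Both rows hold 50 bytes of data, regardless of how Tcl stores
# them internally.
puts "row1: length=[string length $row1]"   ;# row1: length=50
puts "row2: length=[string length $row2]"   ;# row2: length=50

# Reading the values back confirms the contents.
binary scan $row1 c* values1
binary scan $row2 c* values2
puts "row1 first value: [lindex $values1 0], count: [llength $values1]"
puts "row2 first value: [lindex $values2 0], count: [llength $values2]"
```

For binary data the number reported by [string length] is the number of bytes that would be written to a channel configured with -translation binary.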

From: Alexandre Ferrieux on 10 Jan 2010 11:58

On Jan 10, 4:09 pm, "Paul(a)Tcl3D" <p...(a)tcl3d.org> wrote:
> Why do zero values occupy 2 bytes in a binary string?

Slebetman is perfectly right; may I ask why you need [string bytelength]? Are you aware that it is *not* what you want even when preparing a write to a UTF-8 encoded output channel, since it basically measures a "special flavour" of UTF-8 that is entirely internal to Tcl?

-Alex
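
If the goal really is to know how many bytes a string will occupy on a UTF-8 channel, one option (a sketch, not something proposed in the thread) is to convert the string explicitly with [encoding convertto] and take the [string length] of the resulting byte sequence. The sample text below is made up for illustration:

```tcl
# Sketch: count the bytes a string needs in real UTF-8, without
# relying on Tcl's internal representation.
set text "caf\u00e9"                       ;# "café": 4 characters

# [encoding convertto] returns a byte sequence in the named encoding.
set bytes [encoding convertto utf-8 $text]

puts "characters:  [string length $text]"   ;# characters:  4
puts "utf-8 bytes: [string length $bytes]"  ;# utf-8 bytes: 5
```

The "é" takes two bytes in UTF-8, which is why the byte count is one larger than the character count.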

From: slebetman on 10 Jan 2010 21:45

On Jan 11, 12:58 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com> wrote:
> Slebetman is perfectly right ; may I ask why you need [string
> bytelength] ? Are you aware that it is *not* what you want even when
> preparing a write to an utf-8 encoded output channel, since it is
> basically measuring a "special flavour" of UTF-8 that is entirely
> internal to Tcl ?

I think both the documentation and the error message generated when calling [string bytelength] without arguments should state: DO NOT USE THIS, see string length instead.

From: Arjen Markus on 11 Jan 2010 01:25

On 10 jan, 16:31, "slebet...(a)yahoo.com" <slebet...(a)gmail.com> wrote:
> Because Tcl is implemented in C and because in C, strings are
> terminated by nul (0x00), the tcl interpreter internally encodes nuls
> as a special two-byte character. The [string bytelength] is really
> there mainly for debugging purposes or to workaround any possible edge
> cases not automatically handled by tcl. For everything else use
> [string length].

The reason is not so much that C uses NUL bytes to terminate strings, but that Tcl uses UTF-8 internally. With "counted strings" the extra byte would not be needed, but this is how Tcl's internal UTF-8 encodes NUL bytes.

Regards,

Arjen
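
The arithmetic behind the 100 versus 50 in the original output can be sketched as follows. Tcl's internal string form stores NUL as the two-byte sequence 0xC0 0x80 and otherwise follows the usual UTF-8 byte widths. The helper internalByteLength is hypothetical, written only for this illustration, and it assumes every character lies in the Basic Multilingual Plane:

```tcl
# Hypothetical helper: estimate the size of a string in Tcl's internal
# UTF-8 form by applying the per-character byte widths by hand.
# Assumption: all characters are in the BMP (code points <= 0xFFFF).
proc internalByteLength {s} {
    set total 0
    foreach ch [split $s ""] {
        scan $ch %c code            ;# code point of this character
        if {$code == 0} {
            incr total 2            ;# NUL is stored as 0xC0 0x80
        } elseif {$code < 0x80} {
            incr total 1            ;# plain ASCII
        } elseif {$code < 0x800} {
            incr total 2
        } else {
            incr total 3
        }
    }
    return $total
}

set row1 [string repeat [binary format c 0] 50]
set row2 [string repeat [binary format c 1] 50]

puts [internalByteLength $row1]    ;# 100
puts [internalByteLength $row2]    ;# 50
```

Fifty NUL characters at two bytes each give the 100 reported for row1, while fifty 0x01 characters at one byte each give the 50 reported for row2.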