From: Alexandre Ferrieux on 14 Jan 2010 11:57

On Jan 14, 3:50 pm, "MartinLemburg(a)Siemens-PLM" <martin.lemburg.siemens-...(a)gmx.net> wrote:
> Hi,
>
> I know ... 1000s of times it is said ... don't care, while scripting
> in tcl, about the internal type of the data you are working with.
>
> But ... I care, because handling bigger structured data (dicts with
> nesting dicts, lists (containing dicts, ...), ...) it is not that
> nice, if shimmering occurs.
> And if producing e.g. a report of a complex, high precision
> calculation and its parameters and suddenly all the numerical data
> lost its numerical representation? For me a "Ooh NOO".
>
> So what's about this:
>
> % info patchlevel
> 8.6b1.1
> % proc objtype {arg} {return [string map {"pure" "string"} [lindex [split [::tcl::unsupported::representation $arg]] 3]];}
> % set d [dict create a 1 b 2 c 3]
> a 1 b 2 c 3
> % set d2 $d
> a 1 b 2 c 3
> % set l [lrepeat 3 $d]
> {a 1 b 2 c 3} {a 1 b 2 c 3} {a 1 b 2 c 3}
> % objtype $d
> dict
> % objtype $d2
> dict
> % objtype $l
> list
> % objtype [lindex $l 0]
> dict
> % format {%30s} $d; # expecting that the string representation of d will be formatted, ...
>                    a 1 b 2 c 3
> % objtype $d; # ... but the data itself is converted to a string for formatting
> string
> % objtype $d2; # even the connection from d to d2 was not cut, while converting to a string
> string
> % objtype [lindex $l 0]; # even the values in the list are not "duplicated" to save their internal type
> string
> % set d [dict create {*}$d]
> a 1 b 2 c 3
> % objtype $d
> dict
> % objtype $d2; # wow, the connection from d to d2 still existed while expanding d
> list
>
> So, why "format" converts the data to be formatted instead of using
> the string representation?
>
> Using "format" I never expected that the data I want to format will
> shimmer!
> I expected the result to be a string, not the format argument to be
> converted to a string.

The reason is that there are two kinds of "string": the "string rep", which does coexist with any internal rep, and is basically a modified-UTF-8 string with a terminating \0, and the String internal rep, which is a Unicode string. As it is an internal rep, the String obviously erases whatever previous intrep was there.

So it all boils down to Tcl_Format requesting its arguments facing '%s' to be first converted to a String object. This happens here:

tclStringObj.c, line 1875:
    numChars = Tcl_GetCharLength(segment);

As you can guess, the reason is that the whole [format] concatenation is done on such objects (in Unicode). This in turn, I guess, is due to field width specifiers, which are expressed in characters, not bytes; Unicode is the realm of character-counting...

As a workaround, just don't use [format %s], just use EIAS ("everything is a string") "naked":

    set d {1 2 3 4}
    => 1 2 3 4
    dict get $d 1
    => 2
    puts [format %d 1]$d[format %d 2]
    => 11 2 3 42
    ::tcl::unsupported::representation $d
    => value is a dict with a refcount of 4, object pointer at 0x97236f8, internal representation 0x9733738:0x97236e0, string representation "1 2 3 4".

-Alex
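
A minimal sketch of the check and the workaround described above, assuming Tcl 8.6 or later, where ::tcl::unsupported::representation is available. The helper name objtype follows Martin's post; the variables line and padded are only illustrative. The command is unsupported, so its output format may change, and newer Tcl releases may no longer show the conversion reported for 8.6b1.1.

```tcl
# Sketch only: relies on ::tcl::unsupported::representation (Tcl 8.6+),
# whose output format is not guaranteed to stay stable.
proc objtype {value} {
    # The description starts with "value is a <type> with a refcount of ...";
    # a value without an internal rep is reported as "pure string".
    set type [lindex [split [::tcl::unsupported::representation $value]] 3]
    if {$type eq "pure"} {
        return "pure string"
    }
    return $type
}

set d [dict create a 1 b 2 c 3]
puts "fresh dict:          [objtype $d]"

# Workaround from the post: plain substitution reads the string rep.
set line "report: $d"
puts "after substitution:  [objtype $d]"

# The call discussed in this thread; on 8.6b1.1 it converted $d to String.
set padded [format {%30s} $d]
puts "after format %30s:   [objtype $d]"
```

Comparing the last two lines of output on a given interpreter shows whether [format %s] still changes the internal type there; the dict stays usable either way, since the string rep is never lost.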

From: Andreas Leitgeb on 14 Jan 2010 13:09

MartinLemburg(a)Siemens-PLM <martin.lemburg.siemens-plm(a)gmx.net> wrote:
> so "format {%s} ..." causes the generation of a string representation,
> where none is found, and replaces the original internal object
> representation by the new generated string representation?

Now, I think I understand, and I agree with you, that just querying a string-rep of an object should not "zap" the original rep. But does it really? Why is there a difference between "pure string" and "string" in the output of [rep $d]?

% set d [expr double(1)] ;# -> 1.0
% rep $d
value is a double with a refcount of 4, object pointer at 0x9c07f08,
internal representation (nil):0x3ff00000, string representation "1.0".

% set d [expr double(1)]; rep $d
value is a double with a refcount of 3, object pointer at 0x9c2b238,
internal representation (nil):0x3ff00000, no string representation.

% set d [expr double(1)]; format %s $d; rep $d
value is a string with a refcount of 3, object pointer at 0x9c2b928,
internal representation 0x9c253c8:0x3ff00000, string representation "1.0".

This was strange, as you (imho) rightly complained: why make string the new primary rep? But the double is probably still there, as well.

% set d [expr double(1)]; set d "$d "; rep $d
value is a pure string with a refcount of 3, object pointer at 0x9c2b760,
string representation "1.0 ".

Now, this time it was zapped, but that's fine here.

% set d [expr double(1)]; append d "x"; rep $d
value is a string with a refcount of 3, object pointer at 0x9c24870,
internal representation 0x9c11cc8:0x3ff00000, string representation "1.0x".

I'd really want to know what the old internal rep turned to here...; why doesn't rep name it a "pure string" now, and (nil)ify the second rep? One really cannot use $d in a numeric expr-operation now (tried it). Glitch in "::tcl::unsupported::representation" or elsewhere?

% set d [expr double(1)]; lappend d x; rep $d
value is a list with a refcount of 3, object pointer at 0x9c24690,
internal representation 0x9c2a9b0:(nil), no string representation.

Here, the double rep is really gone.

PS: Don't mind the refcounts. I've got Alex's original lastresult patch compiled in (which accounts for the "4" in my first example) and I don't know why it's 3 rather than 2 in the other cases.

PPS: info patchlevel == 8.6b1.1 (last updated last week, or so).

From: Alexandre Ferrieux on 14 Jan 2010 17:42

On Jan 14, 7:09 pm, Andreas Leitgeb <a...(a)gamma.logic.tuwien.ac.at> wrote:
>
> [...] that just querying
> a string-rep of an object should not "zap" the original rep. But
> does it really? Why is there a difference between "pure string"
> and "string" in the output of [rep $d]?

[rep] just describes the truth, he's innocent :}
See my other post: when rep says "string" it means typePtr==&tclStringType (no invention, the "name" field of tclStringType is "string"). When it says "pure string" it means typePtr==NULL.

So, the only thing that "zaps" a previous internal rep is _forcing_ to tclStringType. And it happens within [format %s] as explained in my post.

Now after digging a bit in the code it appears that:

 - forcing to tclStringType is indeed useful to count chars

 - the actual concatenation is done by Tcl_AppendObjToObj, which for all types except two (String and ByteArray-without-string-rep) works at the string-rep level, hence is not responsible for shimmering.

As a conclusion I'd say that the String-forcing in [format] is (1) useless in the absence of width specifiers, and (2) might be avoided in all cases, by using (slightly slower) UTF-8-char-counting functions.

I'll open a low-prio bug for this. Thanks Martin for unearthing it.

> This was strange, as you (imho) rightly complained: why make string the
> new primary rep? But the double is probably still there, as well

Nope. Just one internal rep at any given time. When it's string, the double is gone. (The same would hold if it were "pure string", since pure string means a null type pointer.)

> % set d [expr double(1)]; append d "x"; rep $d
> value is a string with a refcount of 3, object pointer at 0x9c24870,
> internal representation 0x9c11cc8:0x3ff00000, string representation "1.0x".
>
> I'd really want to know, what the old internal rep turned to here...;
> why doesn't rep name it a "pure string" now, and (nil)ify the second rep?

Apparently [append] suffers from the same suboptimality. Will track it in the same bug report, thanks.

> One really cannot use $d in a numeric expr-operation now (tried it).
> Glitch in "::tcl::unsupported::representation" or elsewhere?

No. 1.0x is hard on anybody's math ;-)
And I don't see why it should be rep's fault that "1.0x" doesn't cooperate with [expr]...

> PS: Don't mind the refcounts. I've got Alex's original lastresult patch
> compiled in (which accounts for the "4" in my first example) and I don't
> know why it's 3 rather than 2 in the other cases.

Flattered :)
3 == 1 in global var d, 2 in proc unknown's handling of "rep".
If you want to see a "1" as a refcount, I'd suggest:

 - avoiding shortcuts that call [unknown]
 - avoiding aliases
 - avoiding variables
 - avoiding [history lastresult] by appending ";set foo 1" to all lines

-Alex
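
The refcount bookkeeping above can be observed with a rough sketch like the following. It assumes Tcl 8.6 or later; refcount is a hypothetical helper name, not part of Tcl. The absolute numbers depend on how the command is invoked (passing the value to the helper adds references of its own), so only the difference between the two readings is meaningful.

```tcl
# Hypothetical helper: extract N from "... with a refcount of N, ..." as
# printed by ::tcl::unsupported::representation (unsupported; format may change).
proc refcount {value} {
    set desc [::tcl::unsupported::representation $value]
    if {[regexp {with a refcount of (\d+)} $desc -> n]} {
        return $n
    }
    return -1 ;# description did not have the expected shape
}

set d [list a b]
set before [refcount $d]

# A second variable shares the same Tcl_Obj, so it holds one more reference.
set d2 $d
set after [refcount $d]

puts "one variable:  $before"
puts "two variables: $after"
```

Both readings go through the same helper, so the extra references it introduces cancel out when the two numbers are compared.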

From: Alexandre Ferrieux on 14 Jan 2010 18:03

On Jan 14, 11:42 pm, I wrote:
>
> I'll open a low-prio bug for this. Thanks Martin for unearthing it.
>

Done as
https://sourceforge.net/tracker/?func=detail&aid=2932421&group_id=10894&atid=110894

-Alex

From: Andreas Leitgeb on 15 Jan 2010 06:14

Alexandre Ferrieux <alexandre.ferrieux(a)gmail.com> wrote:
> See my other post: when rep says "string" it means
> typePtr==&tclStringType

Yes, sorry, I saw that only after writing mine. Most of my comments were kind of voided then, but the one about append is still strange to my understanding.

>> % set d [expr double(1)]; append d "x"; rep $d
>> value is a string with a refcount of 3, object pointer at 0x9c24870,
>> internal representation 0x9c11cc8:0x3ff00000, string representation "1.0x".
>>
>> I'd really want to know, what the old internal rep turned to here...;
>> why doesn't rep name it a "pure string" now, and (nil)ify the second rep?
>
> Apparently [append] suffers from the same suboptimality. Will track it
> in the same bug report, thanks.

But it's still all different from the format-case. I'd have expected "append" to destroy everything except the two string-reps. So, in any case, I'd have expected the ":0x3ff00000" to disappear. That this ":0x3ff00000" was still there was the reason why I even made that goofy expr-test on it. Why is it *not* replaced by ":(nil)", as happens with other operations that eventually thwart the previous type, like "lappend"?

% set d [expr {atan(1)*4}]; lappend d x; rep $d
value is a list with a refcount of 3, object pointer at 0x86528f8,
internal representation 0x8651848:(nil), no string representation.

>> PS: Don't mind the refcounts. I've got Alex's original lastresult patch
>> compiled in (which accounts for the "4" in my first example) and I don't
>> know why it's 3 rather than 2 in the other cases.
> Flattered :)

And I didn't even see that you were participating in this thread when I wrote it :)

> If you want to see a "1" as a refcount, I'd suggest:
> - ...
> - avoiding aliases

Wasn't aware that aliases kept another ref, so now it's clear to me.

Thanks! (also for the bugreport)