From: Mark Morgan on
Hello,
This is related to my previous post "another newbie question, simple
string operation?" My question changed quite a bit over the last few
days, so I thought I should reframe the whole thing.

My basic issue is I've got a master text file that has the text of
other filenames scattered in it. I'm trying to substitute the
contents of the other files into the place where the filename is in
the master text file. Everything is in the same directory.

The filename and the surrounding text that needs to be replaced looks
like this (quoted): "{@INPUTFILE example.txt}". The problem I'm having
is that I can't get it to substitute all of that text, all of the time.
Specifically, if the file example.txt contains (quoted): "Example:
300-45823" then it won't replace it. If it contains "{Example: 300-45823}"
then it will.

Here's the code; I'll explain more afterwards. (There's a program
called Shorthand for Windows, written in Tcl, and that's where some of
the application-specific verbiage comes from.)

set file [open $filename r];
set data [read $file];
close $file;

foreach textspiel [regexp -all -inline {((\{@INPUTFILE)\s+\w+(.txt\}))} $data] {
    set intspiel [string length $textspiel]
    if {$intspiel > 15} {
        set txtposition [string first ".txt" $textspiel];
        set textfilenameend [expr {$txtposition + 3}];
        set textfilename [string range $textspiel 11 $textfilenameend];
        set textfilenametrimmed [string trimleft $textfilename];
        # check whether the file exists
        if {[file exists $textfilenametrimmed]} {
            # read in the contents of the file; add them to the map
            set textfile [open $textfilenametrimmed];
            set contents [read $textfile];
            close $textfile;
            regsub -all {$textspiel} $data {$contents} data;
            sh_input msg "" "$textspiel AND $contents";
        } else {
            set filenotfound "{file not found}";
            regsub -all $textspiel $data $filenotfound data;
        }
    }
}
set file [open $filename w];
puts $file $data;
close $file;


As you can see, I did have to bandaid a few if's in there. I'd be
happy if anyone has any general suggestions on this code, too.

But, I think it must have to do with the brackets. It seems like when
the text file that's going to be substituted into the master file
contains bracketed text then the substitution goes forward. If not,
regexp finds it and sends it down through the code but regsub won't
substitute it.

Thanks in advance for your thoughts/suggestions.

Sincerely,
Mark
From: Donal K. Fellows on
On 21 Dec, 12:28, Mark Morgan <me.mor...(a)yahoo.com> wrote:
> The filename and the surrounding text that will need to be replaced is
> this (quoted):  "{@INPUTFILE example.txt}".  The problem I'm having is
> that I can't get it to substitute all of that text, all of the time.
> Specifically if the file example.txt contains (quoted): "Example:
> 300-45823" then it won't replace it.  If it has "{Example:  300-45823}
> then it will.

Let me start by trying to understand the requirements. You have a
file whose contents include sequences of the form:
    {@INPUTFILE FOOBAR.txt}
and each of those sequences needs to be replaced by the contents of
the file with the given name? Assuming that's so, and that there's no
other quoting to do, then the method is this:

proc processTemplate string {
    # This is exactly the replacement to make a string [subst]-safe
    set s [string map {$ \\$ \[ \\\[ \\ \\\\} $string]
    # Now convert the replacements to embedded commands
    regsub -all {{@INPUTFILE (\w+\.txt)}} $s {[readFromFile \1]} s
    # Process all the substitutions
    return [subst $s]
}
# Simple read-a-file helper
proc readFromFile filename {
    set f [open $filename]
    set d [read $f]
    close $f
    return $d
    # Use this instead if you want recursive template processing:
    # return [processTemplate $d]
}

The use of [string map], [regsub] and [subst] is not as intuitive as
it ought to be. There probably ought to be a -eval or -command option
to [regsub] so that the rest of that stuff can be avoided, but it's
not been implemented yet (it's slightly tricky to make the syntax work
perfectly so that it doesn't clunk, so it's not so far had a high
enough priority for the people doing the Tcl implementation to work
on).
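
To make the three commands above less opaque, here is a small trace of
what each stage produces. It is only a sketch: readFromFile is replaced
by a stub that returns a fixed string (an assumption, so that no files
are needed), and the template text is made up for illustration.

```tcl
# Hypothetical stub standing in for the real file reader, so the
# example runs without touching the file system.
proc readFromFile filename {
    return "<contents of $filename, cost \$5>"
}

set template {Price list [draft]: {@INPUTFILE example.txt} for $user}

# Stage 1: put a backslash in front of $, [ and \ so [subst] ignores them
set s [string map {$ \\$ \[ \\\[ \\ \\\\} $template]
puts $s
# prints: Price list \[draft]: {@INPUTFILE example.txt} for \$user

# Stage 2: turn each marker into an embedded command call
regsub -all {{@INPUTFILE (\w+\.txt)}} $s {[readFromFile \1]} s
puts $s
# prints: Price list \[draft]: [readFromFile example.txt] for \$user

# Stage 3: run the embedded commands; escaped characters become literal again
set result [subst $s]
puts $result
# prints: Price list [draft]: <contents of example.txt, cost $5> for $user
```

Note that the '$5' returned by the stub survives untouched: [subst]
does not rescan the result of a command substitution.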

One thing to note about this code. It's a lot simpler than yours.
Tricks like this are why it is useful to ask here (or look on the
Wiki: http://wiki.tcl.tk) when you're having problems.

Donal.
From: Jonathan Bromley on
On Mon, 21 Dec 2009 06:06:08 -0800 (PST), "Donal K. Fellows" wrote:

[snip nice solution]

>One thing to note about this code. It's a lot simpler than yours.
>Tricks like this are why it is useful to ask here

I was going to post a somewhat different solution myself but
Donal got there first... but his [subst]-based solution raises
some really interesting questions for me as a sometime teacher
and trainer.

Using [subst] on data is obviously convenient and powerful,
but it has always troubled me somewhat. For example:

- Unlike just about everything else in Tcl, [subst] just
  works the way it works and there's not much you can do
  to modify its behaviour. That's fine if it does exactly
  what you need, but I worry about flexibility. Stuff like
  include-file insertion is quite likely to need detailed,
  context-dependent intervention: for example, what should
  happen if the last character of an included file is
  (or is not) a line break?

- The preparatory wardance
      set s [string map {$ \\$ \[ \\\[ \\ \\\\} $string]
  frightens me a lot. It's a piece of user code that mirrors
  the operation of some Tcl internals. Am I alone in finding
  that somewhat distasteful?

And finally, although the solution is neat and instructive,
its relationship to the original requirements is not obvious
to anyone who is not highly Tcl-savvy.

None of this is complaint or criticism. Rather, I guess,
it's an open invitation to help me readjust my attitudes :-)
--
Jonathan Bromley
From: Donal K. Fellows on
On 21/12/2009 15:26, Jonathan Bromley wrote:
> Using [subst] on data is obviously convenient and powerful,
> but it has always troubled me somewhat. For example:

It's magical. I can remember being thoroughly startled the first time I
saw that sort of thing going on too. :-) But it works, and is both fast
and safe.

> - Unlike just about everything else in Tcl, [subst] just
> works the way it works and there's not much you can do
> to modify its behaviour. That's fine if it does exactly
> what you need, but I worry about flexibility. Stuff like
> include-file insertion is quite likely to need detailed,
> context-dependent intervention: for example, what should
> happen if the last character of an included file is
> (or is not) a line break?

Well, that's entirely up to how you write both the [regsub] that inserts
the command substitutions into the string to be processed, and what
those command substitutions do. In this case, I'm using a very
simple-minded model; I'm sure you can come up with more sophisticated
ones. But in summary, there are three steps:

1. Defang; [string map] makes this easy.
2. Put in the interesting substitutions.
3. Splat through [subst].

You can reduce the amount of quoting needed in step #1 by passing extra
options in step #3 (e.g., I could have not quoted '$' characters if I'd
passed the -novariables option to [subst]). But it's easy enough to
handle all three cases.
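
As a sketch of the reduced-quoting variant just mentioned: with
-novariables passed to [subst], only '[' and '\' need a backslash in
step 1. The readFromFile stub and the sample text below are assumptions
made purely for illustration.

```tcl
# Hypothetical stub standing in for the real file reader.
proc readFromFile filename {
    return "<$filename>"
}

set template {Costs $5: {@INPUTFILE example.txt} [sic] C:\temp}

# '$' is not escaped here, because -novariables below switches
# variable substitution off entirely
set s [string map {\[ \\\[ \\ \\\\} $template]
regsub -all {{@INPUTFILE (\w+\.txt)}} $s {[readFromFile \1]} s
set result [subst -novariables $s]
puts $result
# prints: Costs $5: <example.txt> [sic] C:\temp
```

Without -novariables, the unescaped '$5' would be read as a reference
to a variable named 5, so the option and the shorter map have to be
used together.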

> - The preparatory wardance
> set s [string map {$ \\$ \[ \\\[ \\ \\\\} $string]
> frightens me a lot. It's a piece of user code that mirrors
> the operation of some Tcl internals. Am I alone in finding
> that somewhat distasteful?

OK, that just puts a backslash in front of all Tcl's in-double-quotes
metacharacters. Really. An alternative would have been:

regsub -all {[[\\$]} $string {\\&} s

But that's slower and just as magical. :-)
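
A quick check that the two forms agree, on a made-up sample string (the
variable names are only for illustration):

```tcl
set sample {pay $5 to [bob] via C:\temp}

# The [string map] form from the original solution
set viaMap [string map {$ \\$ \[ \\\[ \\ \\\\} $sample]

# The [regsub] form: the bracket expression matches '[', '\' or '$',
# and the replacement is a backslash followed by the matched character
regsub -all {[[\\$]} $sample {\\&} viaRegsub

puts $viaMap
# prints: pay \$5 to \[bob] via C:\\temp
puts [expr {$viaMap eq $viaRegsub}]
# prints: 1
puts [expr {[subst $viaMap] eq $sample}]
# prints: 1
```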

> And finally, although the solution is neat and instructive,
> its relationship to the original requirements is not obvious
> to anyone who is not highly Tcl-savvy.

We know it ought to be more elegant and obvious than this; it's on our
todo list. Maybe next year in Tcl 8.7...?

Donal.
From: Jonathan Bromley on
On Mon, 21 Dec 2009 15:42:25 +0000, "Donal K. Fellows" wrote:

>It's magical. I can remember being thoroughly startled the first time I
>saw that sort of thing going on too. :-) But it works, and is both fast
>and safe.

Understood.

> 1. Defang; [string map] makes this easy.
> 2. Put in the interesting substitutions.
> 3. Splat through [subst].

Nice summary, thanks.

>You can reduce the amount of quoting needed in step #1 by passing extra
>options in step #3 (e.g., I could have not quoted '$' characters if I'd
>passed the -novariables option to [subst]). But it's easy enough to
>handle all three cases.

Right, it seems pointless to do only some of them if one single,
simple recipe will handle the whole lot.

>> - The preparatory wardance
>> set s [string map {$ \\$ \[ \\\[ \\ \\\\} $string]
>> frightens me a lot. It's a piece of user code that mirrors
>> the operation of some Tcl internals. Am I alone in finding
>> that somewhat distasteful?
>
>OK, that just puts a backslash in front of all Tcl's in-double-quotes
>metacharacters. Really.

Yes, I'm aware of that. But a beginner surely would have a hard
time being confident that the set was complete. So it could easily
degenerate into a piece of voodoo, handed down by cut'n'paste from
one project to another, until its original purpose was lost....
Actually I would have thought an encapsulation of that would be
a useful addition to the repertoire: [string unsubst] ??
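
Such an encapsulation could be sketched as an ordinary proc. The name
unsubst is hypothetical; there is no [string unsubst] command in Tcl,
and this is only one way such a helper might look.

```tcl
# Hypothetical helper (not an existing Tcl command): escape exactly the
# characters that [subst] treats specially, so that [subst] applied to
# the result gives the original text back.
proc unsubst text {
    return [string map {$ \\$ \[ \\\[ \\ \\\\} $text]
}

set raw {a $b [c] \d}
set safe [unsubst $raw]
puts $safe
# prints: a \$b \[c] \\d
puts [expr {[subst $safe] eq $raw}]
# prints: 1
```

Keeping the map in one named place at least gives the recipe a
documented purpose when it is copied from project to project.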

>> And finally, although the solution is neat and instructive,
>> its relationship to the original requirements is not obvious
>> to anyone who is not highly Tcl-savvy.
>
>We know it ought to be more elegant and obvious than this; it's on our
>todo list. Maybe next year in Tcl 8.7...?

No, I wasn't criticising Tcl's facilities; I was questioning whether
it's good, especially for beginners or occasional users, to apply
techniques that are so many steps away from the original spec.
Even if it's a tad inefficient, pedestrian step-by-step implementation
of such requirements is sometimes a good investment for future
comprehensibility.
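
For comparison, a pedestrian version might look like the sketch below.
It assumes Tcl 8.5 or later (for lassign and {*}); the proc name
expandIncludes is made up, and the "{file not found}" fallback is
borrowed from the original poster's code. It walks the text marker by
marker and never treats the text as a script.

```tcl
# Hypothetical step-by-step expander: copy plain text through unchanged
# and replace each {@INPUTFILE name.txt} marker with that file's contents.
proc expandIncludes {text} {
    set out ""
    set pos 0
    set re {\{@INPUTFILE\s+(\w+\.txt)\}}
    # -indices reports positions relative to the start of the whole string
    while {[regexp -indices -start $pos $re $text whole name]} {
        lassign $whole first last
        # Copy everything between the previous marker and this one
        append out [string range $text $pos [expr {$first - 1}]]
        set filename [string range $text {*}$name]
        if {[file exists $filename]} {
            set f [open $filename]
            append out [read $f]
            close $f
        } else {
            append out "{file not found}"
        }
        # Continue scanning just after the closing brace of the marker
        set pos [expr {$last + 1}]
    }
    # Copy whatever follows the last marker
    append out [string range $text $pos end]
    return $out
}

puts [expandIncludes {plain $text with [brackets] and no markers}]
# prints: plain $text with [brackets] and no markers
```

Because nothing is passed through [subst], no defanging step is needed,
at the cost of a longer procedure.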

Thanks for the response.
--
Jonathan Bromley