"un-meta" the control characters [Perl]


From: John W. Krahn on 2 Nov 2009 15:54

Paul Lalli wrote:
> A coworker just presented me with this task. I came up with two
> solutions, but I don't like either of them. He has a text document
> and wants to scan it for characters such as newline, tab, form feed,
> carriage return, vertical tab. If found, he wants to replace them
> with their typical representation (ie, \n, \t, \f, \r, \v).
>
> I first gave him the obvious:
> $string =~ s/\n/\\n/;
> $string =~ s/\t/\\t/;
> $string =~ s/\f/\\f/;
> $string =~ s/\r/\\r/;
> $string =~ s/\v/\\v/;

Perl doesn't have a "\v" escape for the vertical tab character:

$string =~ s/\cK/\\v/;

Or:

$string =~ s/\13/\\v/;

Or:

$string =~ s/\xB/\\v/;
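
As a quick check, the three spellings above all denote the same character, vertical tab (code point 11). This is a minimal sketch assuming an ASCII-based platform; the sub name `is_vertical_tab` is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the argument is exactly the vertical tab
# character (code point 11 on ASCII-based platforms).
sub is_vertical_tab {
    my ($char) = @_;
    return $char eq chr(11);
}

# Control-K, octal 13 and hex B are three spellings of the same character.
for my $spelling ("\cK", "\13", "\xB") {
    my $verdict = is_vertical_tab($spelling) ? "vertical tab" : "something else";
    print "$verdict\n";
}
```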

John
--
The programmer is fighting against the two most
destructive forces in the universe: entropy and
human stupidity. -- Damian Conway

From: Ben Morrow on 2 Nov 2009 17:52

Quoth Paul Lalli <mritty(a)gmail.com>:
> A coworker just presented me with this task. I came up with two
> solutions, but I don't like either of them. He has a text document
> and wants to scan it for characters such as newline, tab, form feed,
> carriage return, vertical tab. If found, he wants to replace them
> with their typical representation (ie, \n, \t, \f, \r, \v).
>
> I first gave him the obvious:
> $string =~ s/\n/\\n/;
> $string =~ s/\t/\\t/;
> $string =~ s/\f/\\f/;
> $string =~ s/\r/\\r/;
> $string =~ s/\v/\\v/;
>
> which I don't like because of how much copy/paste is involved. Then I
> came up with:
>
> for (qw/n t f r v/) {
> my $meta = eval("\\$_");
> $string =~ s/$meta/\\$_/;
> }
>
> which I don't like, because the comment he'd have to put in the code
> to explain it would be longer than the code itself, or the first
> version.
>
> So can anyone think of a better way? Is there any kind of intrinsic
> link between a newline character and the letter 'n' that could be used
> to go "backwards" here?

There is no intrinsic link. The only way to determine that 'n' is the
letter for "\n" is to use eval, and as you say, anything using eval comes
out more complicated than a simple list of substitutions.
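
One way to write an eval that turns the letter into the character is to have it evaluate a double-quoted string literal. The sketch below assumes exactly that; `char_for_letter` is a hypothetical name, and it is only called with letters that Perl recognises as string escapes (so not 'v'):

```perl
use strict;
use warnings;

# Hypothetical helper: build the source text "\<letter>" (quotes included)
# and let eval parse it as a double-quoted string literal.
sub char_for_letter {
    my ($letter) = @_;
    # Guard against passing arbitrary code to eval.
    die "expected a single lowercase letter\n" unless $letter =~ /\A[a-z]\z/;
    return eval qq{"\\$letter"};
}

my $string = "one\ttwo\n";
for my $letter (qw/n t f r/) {
    my $char = char_for_letter($letter);
    # \Q...\E makes sure the character is matched literally.
    $string =~ s/\Q$char\E/\\$letter/g;
}
print "$string\n";    # prints: one\ttwo\n
```

Even in this form the explanation is longer than the five plain substitutions it replaces.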

While Uri's solution (using a table of replacements) is nicely general,
in this case (with relatively few replacements, and no conflict between
the replaced and the replacing strings) I would just write

for ($string) {
    s,\n,\\n,g;
    s,\t,\\t,g;
    s,\f,\\f,g;
    s,\r,\\r,g;
    s,\cK,\\v,g;
}

(and curse, once again, the fact that I can't use 'given' in place of
'for' there).
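
The table-driven approach mentioned above is not quoted in this part of the thread. A minimal sketch of the general idea might look like the following; the names `%escape_for` and `escape_controls` are made up here and are not taken from Uri's post:

```perl
use strict;
use warnings;

# Map each control character to its conventional backslash spelling.
my %escape_for = (
    "\n"  => '\n',
    "\t"  => '\t',
    "\f"  => '\f',
    "\r"  => '\r',
    "\cK" => '\v',    # vertical tab
);

# Hypothetical helper: return a copy of the string with every listed
# control character replaced by its two-character representation.
sub escape_controls {
    my ($string) = @_;
    $string =~ s/([\n\t\f\r\cK])/$escape_for{$1}/g;
    return $string;
}

print escape_controls("1\n 2\t 3\f 4\r 5\cK"), "\n";
# prints: 1\n 2\t 3\f 4\r 5\v
```

A single pass over the string also means a replacement can never be rewritten by a later substitution, which matters once the table grows.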

Ben

From: C.DeRykus on 3 Nov 2009 12:48

On Nov 2, 11:07 am, Paul Lalli <mri...(a)gmail.com> wrote:
> A coworker just presented me with this task. I came up with two
> solutions, but I don't like either of them. He has a text document
> and wants to scan it for characters such as newline, tab, form feed,
> carriage return, vertical tab. If found, he wants to replace them
> with their typical representation (ie, \n, \t, \f, \r, \v).
>
> I first gave him the obvious:
> $string =~ s/\n/\\n/;
> $string =~ s/\t/\\t/;
> $string =~ s/\f/\\f/;
> $string =~ s/\r/\\r/;
> $string =~ s/\v/\\v/;
>
> which I don't like because of how much copy/paste is involved. Then I
> came up with:
>
> for (qw/n t f r v/) {
> my $meta = eval("\\$_");
> $string =~ s/$meta/\\$_/;
>
> }
> ...

Did that work? I don't understand why the eval is needed
at all:

my $string = "1\n 2\t 3\f 4\r 5\cK";
for (qw/n t f r cK/) {
    my $meta = "\\$_";
    $string =~ s/$meta/\\$_/;
}
print $string; # 1\n 2\t 3\f 4\r 5\cK

--
Charles DeRykus

From: Ben Morrow on 3 Nov 2009 14:18

Quoth "C.DeRykus" <derykus(a)gmail.com>:
> On Nov 2, 11:07 am, Paul Lalli <mri...(a)gmail.com> wrote:
> >
> > for (qw/n t f r v/) {
> >    my $meta = eval("\\$_");
> >    $string =~ s/$meta/\\$_/;
> >
> > }
>
> Did that work? I don't understand why the eval is needed
> at all:
>
> my $string = "1\n 2\t 3\f 4\r 5\cK";
> for (qw/n t f r cK/) {
> my $meta = "\\$_";
> $string =~ s/$meta/\\$_/;
> }
> print $string; # 1\n 2\t 3\f 4\r 5\cK

That's... evil. It relies on the fact that regexes undergo two separate
expansion phases, and requires that variable expansion happens in the
first phase but other qqish escapes are expanded in the second. I'm not
entirely convinced that's documented behaviour: anyone care to dig out
perlre and prove it one way or the other?
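
The behaviour in question can be observed directly. In this sketch (variable names are illustrative), `$pattern` holds two characters, a backslash and an 'n', yet once interpolated into a regex it matches a real newline:

```perl
use strict;
use warnings;

my $pattern = '\n';             # single quotes: backslash followed by 'n'
my $text    = "line1\nline2";   # contains a real newline, no backslash

print length($pattern), "\n";   # prints: 2

# The variable is interpolated first; the regex compiler then sees \n
# and treats it as the newline escape.
my $interpolated = ($text =~ /$pattern/) ? "match" : "no match";

# With \Q...\E the interpolated backslash is quoted, so the pattern looks
# for a literal backslash followed by 'n' instead.
my $quoted = ($text =~ /\Q$pattern\E/) ? "match" : "no match";

print "$interpolated\n";        # prints: match
print "$quoted\n";              # prints: no match
```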

For extra added evil:

my $bs = "\\";
$string =~ s/$bs$_/$bs$_/g for qw/n r t f/;

Ben

From: Randal L. Schwartz on 3 Nov 2009 15:00

>>>>> "Ben" == Ben Morrow <ben(a)morrow.me.uk> writes:

Ben> For extra added evil:

Ben> my $bs = "\\";
Ben> $string =~ s/$bs$_/$bs$_/g for qw/n r t f/;

And I thought *I* was being bad.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn(a)stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion
