From: C.DeRykus on
On Jun 27, 4:45 pm, Ilya Zakharevich <nospam-ab...(a)ilyaz.org> wrote:
> On 2010-06-26, C.DeRykus <dery...(a)gmail.com> wrote:
>
> > On Jun 25, 11:16 pm, Ilya Zakharevich <nospam-ab...(a)ilyaz.org> wrote:
> >> On 2010-06-26, C.DeRykus <dery...(a)gmail.com> wrote:
>
> >> > () = <>;
>
> >> Try to do it with a terabyte file...
>
> > Hm, sounds like I need to look more closely...
>
> > So a humongous temp array gets built with only
> > the resulting  assignment being optimized away...?
>
> Hmm, IN PRINCIPLE, one could have coded recognition of this construct,
> and somehow advised pp_readline() that its output is going to be
> ignored.  However, given how rarely this construct appears, I doubt this
> was ever done.
>
> > perl -MO=Concise -e "()=<>"
> > 8  <@> leave[1 ref] vKP/REFC ->(end)
> > 1     <0> enter ->2
> > 2     <;> nextstate(main 1 -e:1) v:{ ->3
> > 7     <2> aassign[t3] vKS ->8
> > -        <1> ex-list lK ->6
> > 3           <0> pushmark s ->4
> > 5           <1> readline[t2] lK/1 ->6
> > 4              <#> gv[*ARGV] s ->5
> > -        <1> ex-list lK ->7
> > 6           <0> pushmark s ->7
> > -           <0> stub lPRM* ->-
>
> The only way I know to advise an OP is via flags.  So one should
> compare the flags on this `readline' OP with those on a readline in a
> "usual" list assignment.  If they are identical, there is little chance
> that this construct is memory-optimized.  (But they may differ for
> "other reasons" as well...)
>

Thanks for all the explanations, Ilya. As you mention,
this construct is rare. Probably a ton of work to fix, too.
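[For anyone following along: the point of the subthread is that "() = <>"
calls readline in list context, so perl slurps every line into a temporary
list before the discarded assignment happens. A sketch of the memory-safe
alternative, reading in scalar context and throwing each line away:]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Scalar-context readline returns one line per call, so only the
# current line is ever held in memory -- fine even for a terabyte file.
# (Contrast with "() = <STDIN>", which builds the whole list first.)
while (defined(my $line = <STDIN>)) {
    # discard: nothing is kept beyond the current line
}
```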


--
Charles DeRykus
From: Peter Makholm on
John Kelly <jak(a)isp2dial.com> writes:

> This code reads STDIN and remembers the first non-empty line. That's
> all it cares about.
>
> But it also keeps reading till EOF, acting like the "cat" utility, to
> flush the extra input and avoid broken pipe errors.
>
> But reading line by line, just to throw away the unwanted garbage, is
> inefficient. I would like to jump out of the loop and "bulk flush" the
> remaining input stream.

If your input stream is a terminal on a POSIX system, you can use the
tcflush() function.

tcflush(0, TCIFLUSH)
    or warn "Couldn't flush stdin: $!";

This of course only works if stdin is a terminal and not a pipe from
some other program. It might not work on non-POSIX systems, and it
might not fit your specific need.

This works for me:

#!/usr/bin/perl

use strict;
use warnings;

use POSIX;

my $data = '';

sleep 5;

while (<>) {
    chomp;
    /^\s*$/ and next;
    $data = $_;
    print "data=\"$data\"\n";
    last;
}

tcflush(0, TCIFLUSH)
    or warn "Couldn't flush stdin: $!";

<> or die "1 EOF\n";
<> or die "2 EOF\n";
<> or die "3 EOF\n";
<> or die "4 EOF\n";
<> or die "5 EOF\n";
<> or die "6 EOF\n";
<> or die "7 EOF\n";

__END__
From: Ilya Zakharevich on
On 2010-06-28, John Kelly <jak(a)isp2dial.com> wrote:
> On Sun, 27 Jun 2010 23:53:10 +0000 (UTC), Ilya Zakharevich
><nospam-abuse(a)ilyaz.org> wrote:
>
>>I had the same problem as the OP: avoiding SIGPIPE
>>on the OTHER side of the pipe.
>
>>So I think it is not wise to expect that the core of Perl would be
>>able to help with this problem. I would just do $/ = (1<<20), and do
>>a loop.

\(1<<20), of course

> I went with the loop.
>
> (1<<20) is 1 meg of memory. (1<<15) may run nearly as fast on a large
> file (untested).

AFAIU, we are talking about one context switch (plus small change) per
N input characters. I do not know the price of a context switch on
current hardware; 20 years ago it was about 200-300 cycles, and I
suspect it is more today.

Assuming 1 cycle/char to generate the pipe output, and
1000 cycles/switch, the slowdown of (1<<15) would be noticeable (3%).
These assumptions are plausible, but on the "pessimistic side", so you
might be right. Nevertheless, memory is cheap today; that is why I
chose \(1<<20).
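[The "$/ = \(1<<20), and do a loop" suggestion, sketched out: assigning a
*reference* to a number to $/ puts readline into fixed-size record mode,
so each <> returns up to that many bytes instead of one line.]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Record mode: each readline call returns up to 1 MiB, so a huge
# stream is drained in a few large reads while only one block is
# ever held in memory. Reading to EOF keeps the writer on the other
# end of the pipe from seeing SIGPIPE.
local $/ = \(1 << 20);
while (defined(my $block = <STDIN>)) {
    # discard the block
}
```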

Hope this helps,
Ilya
From: John Kelly on
On Mon, 28 Jun 2010 18:34:57 +0000 (UTC), Ilya Zakharevich
<nospam-abuse(a)ilyaz.org> wrote:

>On 2010-06-28, John Kelly <jak(a)isp2dial.com> wrote:
>> On Sun, 27 Jun 2010 23:53:10 +0000 (UTC), Ilya Zakharevich
>><nospam-abuse(a)ilyaz.org> wrote:
>>
>>>I had the same problem as the OP: avoiding SIGPIPE
>>>on the OTHER side of the pipe.
>>
>>>So I think it is not wise to expect that the core of Perl would be
>>>able to help with this problem. I would just do $/ = (1<<20), and do
>>>a loop.
>
> \(1<<20), of course
>
>> I went with the loop.
>>
>> (1<<20) is 1 meg of memory. (1<<15) may run nearly as fast on a large
>> file (untested).
>
>AFAIU, we are talking about one context switch (plus small change) per
>N input characters. I do not know the price of context switch on
>current hardware; 20 years ago it was about 200-300 cycles, and I
>suspect it is more today.
>
>Assuming 1 cycle/char to generate the pipe output, and
>1000 cycles/switch, the slowdown of (1<<15) would be noticeable (3%).

OK. But a 3% difference seems small to me.


>These assumptions are plausible, but on the "pessimistic side"; so you
>might be right. Nevertheless, today the price of memory is not large

A gig of RAM on my PC is cheap. But I still like to conserve it. Just
old-fashioned, I guess.



--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php

From: Big and Blue on
> John Kelly<jak(a)isp2dial.com> writes:
>
>> This code reads STDIN and remembers the first non-empty line. That's
>> all it cares about.
>>
>> But it also keeps reading till EOF, acting like the "cat" utility, to
>> flush the extra input and avoid broken pipe errors.
>>
>> But reading line by line, just to throw away the unwanted garbage, is
>> inefficient. I would like to jump out of the loop and "bulk flush" the
>> remaining input stream.

exec a fork() of cat >/dev/null?
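[One way to read that suggestion, as a sketch (untested against the OP's
setup): fork a child that redirects its STDOUT to /dev/null and execs
cat(1); the child inherits STDIN and drains it to EOF, so the writer on
the other end of the pipe never gets SIGPIPE.]

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Child process drains the remaining input via cat >/dev/null;
# parent just waits for it to finish.
defined(my $pid = fork) or die "fork failed: $!";
if ($pid == 0) {
    open STDOUT, '>', '/dev/null' or die "open /dev/null: $!";
    exec 'cat' or die "exec cat failed: $!";
}
waitpid $pid, 0;
```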

--
Just because I've written it doesn't mean that
either you or I have to believe it.