bulk flush input [Perl]

From: C.DeRykus on 28 Jun 2010 01:11

On Jun 27, 4:45 pm, Ilya Zakharevich <nospam-ab...(a)ilyaz.org> wrote:
> On 2010-06-26, C.DeRykus <dery...(a)gmail.com> wrote:
>
> > On Jun 25, 11:16 pm, Ilya Zakharevich <nospam-ab...(a)ilyaz.org> wrote:
> >> On 2010-06-26, C.DeRykus <dery...(a)gmail.com> wrote:
>
> >> > () = <>;
>
> >> Try to do it with a terabyte file...
>
> > Hm, sounds like I need to look more closely...
>
> > So a humongous temp array gets built with only
> > the resulting assignment being optimized away...?
>
> Hmm, IN PRINCIPLE, one could have coded recognition of this construct,
> and would somehow advise pp_readline() that its output is going to be
> ignored. However, given the frequency of this construct, I doubt this
> was ever done.
>
> > perl -MO=Concise -e "()=<>"
> > 8 <@> leave[1 ref] vKP/REFC ->(end)
> > 1 <0> enter ->2
> > 2 <;> nextstate(main 1 -e:1) v:{ ->3
> > 7 <2> aassign[t3] vKS ->8
> > - <1> ex-list lK ->6
> > 3 <0> pushmark s ->4
> > 5 <1> readline[t2] lK/1 ->6
> > 4 <#> gv[*ARGV] s ->5
> > - <1> ex-list lK ->7
> > 6 <0> pushmark s ->7
> > - <0> stub lPRM* ->-
>
> The only way I know to advise an OP is via flags. So one should
> compare flags on the `readline' OP with those on "usual" list contents
> readline. If they are identical, there is little chance that this
> construct is memory-optimized. (But they may differ by "other
> reasons" as well...)
>

Thanks for all the explanations, Ilya. As you mention,
this construct is infrequent. Probably a ton of work to fix, too.
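
For comparison, here is a minimal sketch of draining a handle in scalar context, so that at most one line is held at a time. It assumes (as suspected above, not verified) that "() = <>" builds the full list before discarding it. The name drain_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh to EOF one line at a time, keeping nothing.
# Returns the number of lines discarded.
sub drain_lines {
    my ($fh) = @_;
    my $count = 0;
    while (defined(my $line = <$fh>)) {
        $count++;    # each line is dropped before the next one is read
    }
    return $count;
}
```

This still pays the per-line cost the original poster wanted to avoid; it only sidesteps the large temporary list.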

--
Charles DeRykus

From: Peter Makholm on 28 Jun 2010 03:06

John Kelly <jak(a)isp2dial.com> writes:

> This code reads STDIN and remembers the first non-empty line. That's
> all it cares about.
>
> But it also keeps reading till EOF, acting like the "cat" utility, to
> flush the extra input and avoid broken pipe errors.
>
> But reading line by line, just to throw away the unwanted garbage, is
> inefficient. I would like to jump out of the loop and "bulk flush" the
> remaining input stream.

If your input stream is a terminal on a POSIX system you can use the
tcflush() function.

tcflush(0, TCIFLUSH)
    or warn "Couldn't flush stdin: $!";

This of course only works if stdin is a terminal and not a pipe from
some other program. It might not work on non-POSIX systems, and it
might not fit your specific need.
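
A small sketch of guarding the call, so that tcflush() is only attempted when the handle really is a terminal. The helper name flush_tty_input is hypothetical; the -t test and POSIX::tcflush are standard Perl.

```perl
use strict;
use warnings;
use POSIX qw(tcflush TCIFLUSH);

# Hypothetical helper: discard pending terminal input on $fh.
# Returns 1 if the flush succeeded, 0 if $fh is not a terminal
# or the flush failed.
sub flush_tty_input {
    my ($fh) = @_;
    return 0 unless -t $fh;    # pipes and files: nothing to flush this way
    if (tcflush(fileno($fh), TCIFLUSH)) {
        return 1;
    }
    warn "Couldn't flush input: $!";
    return 0;
}
```

When it returns 0 for a pipe, the caller still has to read the remaining input some other way.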

This works for me:

#!/usr/bin/perl

use strict;
use warnings;

use POSIX;

my $data = '';

sleep 5;

while (<>) {
    chomp;
    /^\s*$/ and next;
    $data = $_;
    print "data=\"$data\"\n";
    last;
}

tcflush(0, TCIFLUSH)
    or warn "Couldn't flush stdin: $!";

<> or die "1 EOF\n";
<> or die "2 EOF\n";
<> or die "3 EOF\n";
<> or die "4 EOF\n";
<> or die "5 EOF\n";
<> or die "6 EOF\n";
<> or die "7 EOF\n";

__END__

From: Ilya Zakharevich on 28 Jun 2010 14:34

On 2010-06-28, John Kelly <jak(a)isp2dial.com> wrote:
> On Sun, 27 Jun 2010 23:53:10 +0000 (UTC), Ilya Zakharevich
><nospam-abuse(a)ilyaz.org> wrote:
>
>>I had the same problem as the OP: avoiding SIGPIPE
>>on the OTHER side of the pipe.
>
>>So I think it is not wise to expect that the core of Perl would be
>>able to help with this problem. I would just do $/ = (1<<20), and do
>>a loop.

\(1<<20), of course
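
A minimal sketch of that loop, assuming the goal is only to consume and discard the rest of the stream. Setting $/ to a reference to an integer makes readline return fixed-size records instead of lines. The helper name drain_blocks and its byte-count return value are illustrative, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh to EOF in records of up to $size
# characters and throw them away. Returns the amount discarded.
sub drain_blocks {
    my ($fh, $size) = @_;
    $size ||= 1 << 20;       # default: 1 MiB records, as suggested above
    local $/ = \$size;       # reference to an integer => record reads
    my $total = 0;
    while (defined(my $chunk = <$fh>)) {
        $total += length $chunk;    # the chunk itself is not kept
    }
    return $total;
}
```

After the line loop has found what it needs, a call such as drain_blocks(\*STDIN) would finish off the input; passing a smaller second argument, e.g. 1 << 15, trades memory for more read calls.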

> I went with the loop.
>
> (1<<20) is 1 meg of memory. (1<<15) may run nearly as fast on a large
> file (untested).

AFAIU, we are talking about one context switch (plus small change) per
N input characters. I do not know the price of context switch on
current hardware; 20 years ago it was about 200-300 cycles, and I
suspect it is more today.

Assuming 1 cycle/char to generate the pipe output, and
1000 cycles/switch, the slowdown of (1<<15) would be noticeable (3%).
These assumptions are plausible, but on the "pessimistic side"; so you
might be right. Nevertheless, today the price of memory is not large;
this is why I had chosen \(1<<20).
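
The estimate can be checked with a few lines of Perl. This only restates the assumed figures above (1 cycle/char, 1000 cycles/switch, one switch per block); it measures nothing. The function name is hypothetical.

```perl
use strict;
use warnings;

# Overhead, in percent, of one context switch per block of $block_chars,
# under the assumed per-char and per-switch cycle costs.
sub switch_overhead_pct {
    my ($block_chars, $cycles_per_char, $cycles_per_switch) = @_;
    return 100 * $cycles_per_switch / ($block_chars * $cycles_per_char);
}

printf "1<<15: %.3f%%\n", switch_overhead_pct(1 << 15, 1, 1000);
printf "1<<20: %.3f%%\n", switch_overhead_pct(1 << 20, 1, 1000);
```

Under these assumptions the overhead is about 3% for (1<<15) and about 0.1% for (1<<20).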

Hope this helps,
Ilya

From: John Kelly on 28 Jun 2010 14:47

On Mon, 28 Jun 2010 18:34:57 +0000 (UTC), Ilya Zakharevich
<nospam-abuse(a)ilyaz.org> wrote:

>On 2010-06-28, John Kelly <jak(a)isp2dial.com> wrote:
>> On Sun, 27 Jun 2010 23:53:10 +0000 (UTC), Ilya Zakharevich
>><nospam-abuse(a)ilyaz.org> wrote:
>>
>>>I had the same problem as the OP: avoiding SIGPIPE
>>>on the OTHER side of the pipe.
>>
>>>So I think it is not wise to expect that the core of Perl would be
>>>able to help with this problem. I would just do $/ = (1<<20), and do
>>>a loop.
>
> \(1<<20), of course
>
>> I went with the loop.
>>
>> (1<<20) is 1 meg of memory. (1<<15) may run nearly as fast on a large
>> file (untested).
>
>AFAIU, we are talking about one context switch (plus small change) per
>N input characters. I do not know the price of context switch on
>current hardware; 20 years ago it was about 200-300 cycles, and I
>suspect it is more today.
>
>Assuming 1 cycle/char to generate the pipe output, and
>1000 cycles/switch, the slowdown of (1<<15) would be noticeable (3%).

OK. But a 3% difference seems small to me.

>These assumptions are plausible, but on the "pessimistic side"; so you
>might be right. Nevertheless, today the price of memory is not large

A gig of RAM on my PC is cheap. But I still like to conserve it. Just
old fashioned I guess.

--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php

From: Big and Blue on 28 Jun 2010 18:39

> John Kelly<jak(a)isp2dial.com> writes:
>
>> This code reads STDIN and remembers the first non-empty line. That's
>> all it cares about.
>>
>> But it also keeps reading till EOF, acting like the "cat" utility, to
>> flush the extra input and avoid broken pipe errors.
>>
>> But reading line by line, just to throw away the unwanted garbage, is
>> inefficient. I would like to jump out of the loop and "bulk flush" the
>> remaining input stream.

fork() and exec "cat >/dev/null"?
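
A rough sketch of that idea, assuming a Unix-like system with cat on the PATH. The child inherits the input descriptor, points its output at /dev/null, and becomes cat. The helper name drain_with_cat is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: hand the rest of $in to a background
# "cat >/dev/null". Returns the child's PID so the caller can
# waitpid() on it if it cares.
sub drain_with_cat {
    my ($in) = @_;
    defined(my $pid = fork()) or die "fork failed: $!";
    if ($pid == 0) {
        # Child: make $in its stdin (unless it already is), discard output.
        if (fileno($in) != 0) {
            open(STDIN, '<&', $in) or die "dup to stdin failed: $!";
        }
        open(STDOUT, '>', '/dev/null') or die "open /dev/null failed: $!";
        exec('cat') or die "exec cat failed: $!";
    }
    return $pid;    # parent carries on immediately
}
```

Anything Perl has already buffered from the handle stays in the parent and is simply never seen by cat, which is harmless here since it would be discarded anyway.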

--
Just because I've written it doesn't mean that
either you or I have to believe it.
