From: loial on
I need to read a large amount of data that is being returned in
standard output by a shell script I am calling.

(I think the script should really be writing to a file but I have no
control over that)

Currently I have the following code. It seems to work, however I
suspect this may not work with large amounts of standard output.

What is the best way to read a large amount of data from standard
output and write to a file?

Here is my code.

process=subprocess.Popen(['myscript', 'param1'],
shell=False,stdout=subprocess.PIPE,stderr=subprocess.PIPE)

cmdoutput=process.communicate()

myfile = open('/home/john/myoutputfile','w')

myfile.write(cmdoutput[0])

myfile.close()

From: Gabriel Genellina on
On Fri, 06 Aug 2010 06:06:29 -0300, loial <jldunn2000(a)gmail.com> wrote:

> I need to read a large amount of data that is being returned in
> standard output by a shell script I am calling.
>
> (I think the script should really be writing to a file but I have no
> control over that)
>
> Currently I have the following code. It seems to work, however I
> suspect this may not work with large amounts of standard output.
>
> What is the best way to read a large amount of data from standard
> output and write to a file?
>
> Here is my code.
>
> process=subprocess.Popen(['myscript', 'param1'],
> shell=False,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>
> cmdoutput=process.communicate()
>
> myfile = open('/home/john/myoutputfile','w')
>
> myfile.write(cmdoutput[0])
>
> myfile.close()


If all you do with the process' output is to write it to the output file,
you can avoid the intermediate step:


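myfile = open('/home/john/myoutputfile', 'w')
myerror = open('/home/john/myerrorfile', 'w')
# The child writes straight to the files; nothing is buffered in Python.
process = subprocess.Popen(['myscript', 'param1'],
                           shell=False, stdout=myfile, stderr=myerror)
process.wait()
myfile.close()
myerror.close()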

(untested)

--
Gabriel Genellina

From: Nobody on
On Fri, 06 Aug 2010 02:06:29 -0700, loial wrote:

> I need to read a large amount of data that is being returned in
> standard output by a shell script I am calling.
>
> (I think the script should really be writing to a file but I have no
> control over that)

If the script is writing to stdout, you get to decide whether its stdout
is a pipe, file, tty, etc.

> Currently I have the following code. It seems to work, however I
> suspect this may not work with large amounts of standard output.

> process=subprocess.Popen(['myscript', 'param1'],
> shell=False,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>
> cmdoutput=process.communicate()

It's certainly not the best way to read large amounts of output.
Unfortunately, better solutions get complicated when you need to read more
than one of stdout and stderr, or if you also need to write to stdin.

If you only need stdout, you can just read from process.stdout in a loop.
You can leave stderr going to wherever the script's stderr goes (e.g. the
terminal), or redirect it to a file.
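
A minimal sketch of that read loop, in case it helps. The function name
stream_stdout_to_file and the 64 KiB chunk size are my own choices for
illustration, not anything the subprocess module prescribes. stderr is not
redirected here, so it goes wherever the parent's stderr goes.

```python
import subprocess


def stream_stdout_to_file(cmd, out_path, chunk_size=64 * 1024):
    """Run cmd and copy its stdout to out_path one chunk at a time.

    Only one chunk is held in memory at any moment, unlike communicate(),
    which collects the whole output before returning.
    Returns the child's exit status.
    """
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    with open(out_path, 'wb') as outfile:
        while True:
            chunk = process.stdout.read(chunk_size)
            if not chunk:  # an empty read means the child closed its stdout
                break
            outfile.write(chunk)  # any per-chunk processing would go here
    process.stdout.close()
    return process.wait()
```

With the original poster's command this would be called as
stream_stdout_to_file(['myscript', 'param1'], '/home/john/myoutputfile').
The loop is only worth having if you want to look at the data on the way
through; if you just want it in a file, redirecting is simpler.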

If you really do need both stdout and stderr, then you either need to
enable non-blocking I/O, or use a separate thread for each stream, or
redirect at least one of them to a file.
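
As a rough illustration of the thread-per-stream option: the sketch below
starts one thread for each pipe so that neither can fill up and block the
child while the other is being read. The names drain and run_capturing_both
are hypothetical, and writing each stream to a file is only there to keep
the example short.

```python
import subprocess
import threading


def drain(pipe, path, chunk_size=64 * 1024):
    """Copy everything from pipe into the file at path, then close the pipe."""
    with open(path, 'wb') as outfile:
        while True:
            chunk = pipe.read(chunk_size)
            if not chunk:  # empty read: the child closed this stream
                break
            outfile.write(chunk)
    pipe.close()


def run_capturing_both(cmd, out_path, err_path):
    """Run cmd, draining stdout and stderr concurrently; return exit status."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    threads = [
        threading.Thread(target=drain, args=(process.stdout, out_path)),
        threading.Thread(target=drain, args=(process.stderr, err_path)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # both pipes have reached end-of-file after this
    return process.wait()
```

Reading the two pipes one after the other in a single thread is what risks
a deadlock: the child can block writing to the pipe you are not currently
reading once that pipe's buffer is full.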

FWIW, Popen.communicate() waits on the pipes with select()/poll() on Unix
and uses separate threads on Windows (the standard library doesn't include
a mechanism to enable non-blocking I/O on Windows).

> What is the best way to read a large amount of data from standard
> output and write to a file?

For this case, the best way is to just redirect stdout to a file, rather
than passing the data through your Python code, i.e.:

outfile = open('outputfile', 'w')
# subprocess.call() waits for the command and returns its exit status
returncode = subprocess.call(..., stdout=outfile)
outfile.close()