From: Chris Seberino on 10 Jun 2010 11:40
On Jun 10, 6:52 am, Nobody <nob...(a)nowhere.com> wrote:
> Without the p1.stdout.close(), if the reader (grep) terminates before
> consuming all of its input, the writer (ls) won't terminate so long as
> Python retains the descriptor corresponding to p1.stdout. In this
> situation, the p1.wait() will deadlock.
>
> The communicate() method wait()s for the process to terminate. Other
> processes need to be wait()ed on explicitly, otherwise you end up with
> "zombies" (labelled "<defunct>" in the output from "ps").

You are obviously very wise on such things. I'm curious whether this deadlock issue is a rare event, since grep (hopefully) would rarely terminate before consuming all its input.

Even if zombies are created, they will eventually get dealt with by the OS without any user intervention, right?

I'm just trying to verify that the naive solution of not worrying about these deadlocks will still be OK and handled adequately by the OS. :)

cs
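
For reference, the pattern described in the quoted text can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper name `run_pipeline` and its return value are assumptions, and it is written for a current Python 3 `subprocess` module. The two steps under discussion are the `p1.stdout.close()` call in the parent and the explicit `p1.wait()`. Note that a zombie stays in the process table until its parent wait()s for it or the parent itself exits.

```python
import subprocess


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer_cmd | consumer_cmd` (hypothetical helper).

    Returns (consumer_output, producer_returncode, consumer_returncode).
    """
    p1 = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(consumer_cmd, stdin=p1.stdout,
                          stdout=subprocess.PIPE)

    # Drop the parent's copy of the pipe's read end. If the consumer exits
    # early, the producer then sees a broken pipe instead of blocking on a
    # pipe that the parent is still holding open.
    p1.stdout.close()

    # communicate() reads the consumer's output and wait()s for it.
    output, _ = p2.communicate()

    # The producer must be wait()ed on explicitly so it is reaped and does
    # not linger as a "<defunct>" entry while this process keeps running.
    p1.wait()

    return output, p1.returncode, p2.returncode
```

With the commands from the discussion, a call might look like `run_pipeline(["ls", "-l"], ["grep", "py"])`.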
From: Lie Ryan on 10 Jun 2010 11:42
On 06/10/10 21:52, Nobody wrote:
> Spawning child processes to perform tasks
> which can easily be performed in Python is inefficient

Not necessarily so. Recently I wrote a script that takes the blink of an eye when I pipe through cat/grep to prefilter the lines before doing further complex filtering in Python; however, when I eliminated the cat/grep subprocess and rewrote it in pure Python, what was done in the blink of an eye turned into ~8 seconds (not much to fret about, but it shows that using subprocess can be faster). I eventually optimized a couple of things and reduced it to ~1.5 seconds, at which point I stopped, since going even faster would require reading in larger chunks, something I don't really want to do.

The task was to take a directory of ~10 files, each containing thousands of short lines (~5-10 chars per line on average), and count the number of lines that match certain criteria. It is a very typical script job, but the overhead of reading the files line by line in pure Python can be straining (you can read in larger chunks, but that's not the point: eliminating grep may not come for free).
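
To make the two pure-Python approaches mentioned above concrete, here is a rough sketch of a line-by-line count next to a chunked one. It is not the script from the post: the function names, the plain substring test standing in for the "criteria", and the default chunk size are all assumptions made for illustration. No timings are claimed for either version.

```python
import os


def count_matching_lines(directory, needle):
    """Count lines containing `needle` (bytes), reading line by line."""
    total = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            for line in f:
                if needle in line:
                    total += 1
    return total


def count_matching_lines_chunked(directory, needle, chunk_size=1 << 20):
    """Same count, but reading large chunks and splitting them in bulk.

    Assumes `needle` does not itself contain a newline.
    """
    total = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            tail = b""  # incomplete last line carried over between chunks
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                lines = (tail + chunk).split(b"\n")
                tail = lines.pop()  # may be a partial line; finish it later
                total += sum(1 for line in lines if needle in line)
            # A final line without a trailing newline is still a line.
            if tail and needle in tail:
                total += 1
    return total
```

The chunked version trades the simple `for line in f` loop for manual handling of lines that straddle chunk boundaries, which is the kind of extra bookkeeping the post says it would rather avoid.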