From: Joshua Maurice on
Suppose I'm writing a build system, possibly built upon GNU Make.
Developers using this build system can write near arbitrary build
commands. I do not know all possible build steps as new build steps
may be added in the future, and I do not have source code control over
all children, grandchildren, etc., as they will be running code not
written in-house, like gcc, javac, etc.

We have a computer which waits for a checkin to source control, gets
latest, then kicks off a build and tests. Sometimes these tests leave
orphaned processes which hold onto file handles and fail. Sometimes
the tests just hang. Sometimes, these processes hang onto file
descriptors which make the next build fail. Ideally, we would like a
programmatic way to kill all processes spawned directly or indirectly
by GNU Make.

Process groups are insufficient. They cannot be nested. The tests of
our product may use process groups to facilitate some logic, and thus
the build and tests cannot be contained inside a single process
group.

The only solution I see is to reach for a larger gun and spawn each
build under its own user name so we could programmatically identify
all such processes to kill them. This is just reaching for a bigger
gun, a bigger "process group". Sorry for devolving into a rant, but
this seems like basic functionality: the ability to kill a process
tree containing processes whose source code is not under my control
or which may not play nice. Yet it seems that this is not possible.

As a developer working on the product using the build system, I would
also like this functionality. Otherwise, I have to run ps after each
build and test to see if I have any stray processes chilling in the
background which may be holding onto file descriptors, memory, etc.

Am I missing anything?
From: Chris Friesen on
On 04/16/2010 03:48 PM, Joshua Maurice wrote:

> We have a computer which waits for a checkin to source control, gets
> latest, then kicks off a build and tests. Sometimes these tests leave
> orphaned processes which hold onto file handles and fail. Sometimes
> the tests just hang. Sometimes, these processes hang onto file
> descriptors which make the next build fail. Ideally, we would like a
> programmatic way to kill all processes spawned directly or indirectly
> by GNU Make.

If your OS supports it, you could run the make in a new PID (process ID)
namespace. Then when you're done with it you just terminate the whole
namespace. Linux supports this as of 2.6.24, though I haven't played
with it yet.

PID namespaces nest, so even if the test code uses them itself, it will
still work.
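
For illustration, here is a minimal sketch of the idea. It assumes a
Linux box with the util-linux unshare tool and unprivileged user
namespaces enabled; neither is a given. The helper name run_in_pidns is
made up. The command becomes PID 1 of a new PID namespace, and when that
PID 1 exits the kernel kills whatever is still left in the namespace.

```shell
# Hypothetical helper (not from the thread): run a command as PID 1 of a
# fresh PID namespace. Assumes Linux and util-linux unshare(1).
run_in_pidns() {
    # --user --map-root-user : assumption, lets an unprivileged user create
    #                          the namespace; drop both when running as root
    # --pid --fork           : fork so that the command itself becomes PID 1
    #                          of the new PID namespace
    unshare --user --map-root-user --pid --fork "$@"
}

# Usage (illustrative only): run_in_pidns make all
```

One caveat: the command ends up as the init process of the namespace,
which has its own signal handling rules, so interactive use may behave
a little differently from a plain make.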

Chris
From: Ersek, Laszlo on
On Fri, 16 Apr 2010, Joshua Maurice wrote:

> Ideally, we would like a programmatic way to kill all processes spawned
> directly or indirectly by GNU Make.

Start the root make like this:

$ LEGACY=$(mktemp)
$ make ... 77<"$LEGACY"

Then:

$ while fuser -s -k "$LEGACY"; do sleep 1; done

I'm not sure if the loop is necessary, but it seems more robust. Not
because I don't trust a single SIGKILL to be enough for any process, but
because I suspect that fuser doesn't lock the process table, and what if a
process forks a child, just before getting killed, but *after* fuser has
examined and killed some other processes? The newly forked child may not
show up in the set of processes fuser examines. A race against a malicious
fork bomb seems unpromising, but the loop should protect against such
sporadic (unintentional) races.

You might want to replace -s with -v, so as to see whether the loop is
ever of actual use. You might want to add -TERM too, which makes fuser
send SIGTERM instead of the default SIGKILL.
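
Put together, the above could be sketched roughly as follows. The
function names run_marked and kill_marked are made up, and fuser from
psmisc is assumed to be installed. The sketch uses fd 9 rather than 77,
which is a substitution for portability: POSIX sh only guarantees
single-digit descriptors in redirections. It sends one SIGTERM round
first and then falls back to the SIGKILL loop.

```shell
# Hypothetical helpers (names are not from the thread) wrapping the
# marker-file trick. Assumes fuser from psmisc is installed.

# Run a command with the marker file open on fd 9. Using fd 9 instead of
# 77 is an assumption made for portability across POSIX shells.
run_marked() {
    marker=$1
    shift
    "$@" 9<"$marker"
}

# Kill every process that still has the marker file open.
# $2 is the grace period in seconds between SIGTERM and SIGKILL (default 2).
kill_marked() {
    marker=$1
    grace=${2:-2}
    # Polite pass: SIGTERM. fuser exits non-zero when nobody holds the file.
    if fuser -s -k -TERM "$marker"; then
        sleep "$grace"
    fi
    # Forceful pass: SIGKILL, repeated to catch children forked mid-scan.
    while fuser -s -k "$marker"; do
        sleep 1
    done
}
```

Usage would be something like run_marked "$LEGACY" make ... followed by
kill_marked "$LEGACY". Note that a descendant which closes all of its
inherited descriptors no longer holds the marker file and would escape.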

Cheers,
lacos
From: Ersek, Laszlo on
On Fri, 16 Apr 2010, Joshua Maurice wrote:

> Ideally, we would like a programmatic way to kill all processes spawned
> directly or indirectly by GNU Make.
>
> Process groups are insufficient. They cannot be nested. The tests of our
> product may use process groups to facilitate some logic, and thus the
> build and tests cannot be contained inside a single process group.
>
> The only solution I see is to reach for a larger gun and spawn each
> build under its own user name so we could programmatically identify all
> such processes to kill them. This is just reaching for a bigger gun, a
> bigger "process group".

You could run each test sequence in a separate VM, starting from a
pristine image. That way you could even experiment with cross-uid test
cases.

Cheers,
lacos