Re: subreapers

From: Martin \ <>
Date: Mon, 12 Dec 2016 17:00:19 +0100

On Sun, 11 Dec 2016 18:00:46 +0000
Jonathan de Boyne Pollard <> wrote:

> What M. Misuth is doing is the most imaginative use of local reapers
> that I have come across.

Thanks, it's sometimes hard to defend this idea. Many people see it as
"wacky" for a million reasons. I am not really persuaded my implementation
is proper; after all, I am just a wacky experimenter.

> What I wrote in the nosh doco back in version
> 1.0 was:

> > This yields a slightly more informative process tree.
> This was presented as a mere side-effect by Poettering and Sievers in
> 2012. The main idea was a rather vague, and in the end not implemented,
> notion that user instances of systemd could somehow make use of the fact
> that they were made the parents of orphaned grandchildren processes in
> order to Do Stuff. In fact, it is the mere side-effect that turns out
> in practice to be the major benefit.
> You should not underestimate how useful the effects on the process tree
> are. All service manager instances in nosh are local reapers, and one
> sees the effects with the system-wide service manager as well as with
> per-user service managers. One doesn't see such a pronounced effect
> with the system-wide service manager because, as I have noted elsewhere
> in this thread, as a result of the pressure from the daemontools world
> for the past two decades the world already makes the process tree of
> system-wide service management fairly well organized. Lots of daemons
> do not fork-and-exit-parent any more, and orphaned grandchildren simply
> do not arise as much in this area as they used to at the turn of the
> century.

This is indeed correct. In my observation, though, it misses one important
corner case:

  - Many daemons can be "extended", through configuration files or other
    means, to spawn various workers.

I think this doesn't concern many people here, as they are pretty open and anal
about simply not running such stuff. It is somewhat similar to the OpenBSD
response to jails and containers being crap, which boils down to "just fix your
shitty daemons".

Yes, that is indeed the proper solution. Let's see how it worked out: it failed
more often than not. As already discussed, it took some years to get even a
"foreground" flag.

At the same time, we somehow have to make do. Sometimes these "process
swarm" (as I call them) "spawnages" are really unexpected.

For example, in one of my play sessions I found that X, in non-root user mode
in one of my Linux experiments, for some reason spawns "pkexec", which
immediately dies. X fails to reap it as well. All I know is that pkexec is yet
another bad sudo clone, a policykit-based one (I pray I got that right). I am
probably missing some config. Or a package. Or some other part of the magic
sauce.

Sometimes these "extension" workers are themselves configurable to spawn
"sub-workers", and these "sub-workers" might be written by the local admin
and spawn yet more levels, and we get a really nice and deep tree. We are
told processes are light and there is nothing wrong with that (a fact which
I came to appreciate greatly and agree with).

Now in this situation the supervisor does the correct thing: it sends SIGTERM,
the termination signal, to its child. The properly foregrounded daemon does the
right thing as well, and propagates the signal down the tree to its workers
(those it knows about, at least). But somewhere "deeper" there was a "shell out"
into an sh script which ignores SIGTERM (no trap). The local admin didn't know
he can do that. Up to that point everything is shutting down properly, but this
process branch stays running as if nothing happened (holding file locks and
such, for example). Thus the "service" actually "broke".

You won't ever know that the service broke, though. Everything looks fine, yet
the offender was "teleported" to the top, under init, and lingers there. On a
process-heavy machine you might not even spot it.

I am as guilty as many others: I unintentionally subjected myself to this,
then scratched my head for hours.

You will realise something is wrong only when you start the given
service anew and something goes haywire. Or not. Who knows.

With a subreaping leader, at least this doesn't happen.
The subreaper will certainly know that it still has some children.
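
To make that concrete, here is a minimal sketch in Python. It assumes Linux,
where a process becomes a subreaper through prctl(PR_SET_CHILD_SUBREAPER);
FreeBSD and DragonFly use procctl(PROC_REAP_ACQUIRE) instead, which is not
shown. The function names are made up for illustration, and this is not how
any particular supervisor does it. A throwaway "supervisor" runs a "service"
that forks a worker and exits at once; the supervisor then asks the kernel
whether it still has children.

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Linux only: have orphaned descendants reparented to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), ctypes.c_ulong(0),
                  ctypes.c_ulong(0), ctypes.c_ulong(0)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def has_children():
    """True if this process has any child, running or waiting to be reaped."""
    try:
        # WNOWAIT only peeks, so a dead child is left for a later waitpid().
        os.waitid(os.P_ALL, 0, os.WEXITED | os.WNOHANG | os.WNOWAIT)
    except ChildProcessError:
        return False
    return True


def leftover_after_service_exit(use_subreaper):
    """Run a forking "service" under a throwaway supervisor; report leftovers."""
    result_r, result_w = os.pipe()
    supervisor = os.fork()
    if supervisor == 0:
        try:
            os.close(result_r)
            if use_subreaper:
                become_subreaper()
            hold_r, hold_w = os.pipe()  # the worker lives until hold_w closes
            service = os.fork()
            if service == 0:
                try:
                    os.close(hold_w)
                    os.close(result_w)
                    if os.fork() == 0:
                        try:
                            os.read(hold_r, 1)  # worker: block until EOF
                        finally:
                            os._exit(0)
                finally:
                    os._exit(0)  # service exits at once, orphaning the worker
            os.close(hold_r)
            os.waitpid(service, 0)  # the service itself is gone now
            leftover = has_children()
            os.write(result_w, b"1" if leftover else b"0")
            os.close(hold_w)  # let the worker finish
            if leftover:
                os.waitpid(-1, 0)  # the orphan is ours, so we reap it
        finally:
            os._exit(0)
    os.close(result_w)
    with os.fdopen(result_r, "rb") as pipe:
        answer = pipe.read()
    os.waitpid(supervisor, 0)
    return answer == b"1"
```

Calling leftover_after_service_exit(True) should report a leftover child,
while leftover_after_service_exit(False) should not, because without the
subreaper flag the worker is handed to init (or to some reaper further up).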

This is not a traditional unix thing, but it meshes with the basic primitives
surprisingly well. Conceptually it is simpler than cgroups and other tools
by a large margin.

I am guessing this might have been the original intention in DragonFly.

Yet there are also cases when you actually want to reparent leftover
processes "higher". "Host level" sshd is a great example. Sometimes, however,
you want sshd (in a certain container) to term/kill all the children it has.
It could be extended to do that, but what about other daemons? Should they be
extended as well?

How many upstreams will even "care"? That is often a "site + [container +]
service" specific decision.
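
One way to keep that decision out of the daemons is to let the subreaping
supervisor do the cleanup itself. The following is only a sketch under
assumptions: Linux prctl() for the subreaper flag, and a scan of /proc to
find out who is currently parented to us. The helper names (children_of,
stop_leftovers, demo) and the grace period are hypothetical. The demo
"service" leaves behind a worker that ignores SIGTERM, so the supervisor has
to fall back to SIGKILL.

```python
import ctypes
import os
import signal
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Linux only: have orphaned descendants reparented to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), ctypes.c_ulong(0),
                  ctypes.c_ulong(0), ctypes.c_ulong(0)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def children_of(parent_pid):
    """Return the PIDs whose parent is parent_pid, by scanning Linux /proc."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", "rb") as stat_file:
                stat = stat_file.read()
        except OSError:
            continue  # the process vanished while we were scanning
        # After the parenthesised command name come: state, ppid, ...
        fields = stat.rsplit(b")", 1)[1].split()
        if int(fields[1]) == parent_pid:
            found.append(int(entry))
    return found


def stop_leftovers(grace=0.2):
    """TERM, then KILL, whatever is parented to us; return the fatal signals."""
    fatal_signals = []
    while True:
        leftovers = children_of(os.getpid())
        if not leftovers:
            return fatal_signals
        for pid in leftovers:
            os.kill(pid, signal.SIGTERM)
        time.sleep(grace)
        for pid in leftovers:
            reaped, status = os.waitpid(pid, os.WNOHANG)
            if reaped == 0:  # still alive after the grace period
                os.kill(pid, signal.SIGKILL)
                reaped, status = os.waitpid(pid, 0)
            fatal_signals.append(
                os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0)


def demo():
    """Supervise a "service" whose stubborn worker ignores SIGTERM."""
    result_r, result_w = os.pipe()
    supervisor = os.fork()
    if supervisor == 0:
        try:
            os.close(result_r)
            become_subreaper()
            service = os.fork()
            if service == 0:
                try:
                    os.close(result_w)
                    # Inherited by the worker forked below.
                    signal.signal(signal.SIGTERM, signal.SIG_IGN)
                    if os.fork() == 0:
                        try:
                            time.sleep(30)  # stubborn worker
                        finally:
                            os._exit(0)
                finally:
                    os._exit(0)  # service exits, orphaning the worker
            os.waitpid(service, 0)
            signals = stop_leftovers()
            os.write(result_w, ",".join(map(str, signals)).encode())
        finally:
            os._exit(0)
    os.close(result_w)
    with os.fdopen(result_r, "rb") as pipe:
        report = pipe.read().decode()
    os.waitpid(supervisor, 0)
    return report
```

demo() returns the signals that finally ended the leftovers, as a
comma-separated string. Whether a given site wants this behaviour at all is
still the "site + [container +] service" decision; the point is only that
it can live in the supervisor rather than in every daemon.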

> But the effect for per-user stuff is marked.
> This is because the world of per-user stuff includes "desktop services",
> like the roughly ten servers that have to be started up in order to run
> the "small and lightweight" GNOME Editor. This world is still replete
> with things that fork-and-exit-parent. To give another example: The
> PCDM startup of X desktop environments on TrueOS (formerly PC-BSD)
> starts up a whole bunch of user processes via fork-and-exit-parent.
> These all end up in a different part of the process tree to the desktop
> processes sitting under the top-of-desktop-session process, under
> process #1 or the nearest local reaper. In the output of "ps -dax", all
> of these processes are scattered all over the shop.

Exactly. Moreover, the same issue with unexpected "workers" happens.

With jails this gets even more complex. There is the parent-child relation
and there is the process-jail relation. Of course you can filter by jid/jail
name with ps, but why bother when you can ask the kernel to keep it all
together? Reparenting happens outside of the process scope. The subreaper
won't even wake when it acquires a new child. If a jail is run the "standard"
way, however, processes will "scatter" in the process table. When a jail is
run with a supervisor instead, it all starts to make much more sense. If both
the host-level and in-jail supervisors are kept on the same version, you can
even control services across the host>jail boundary very comfortably.

The supervisor leads the whole jail, and within its process subtree everything
sticks together ...

... until a doublefork. The "offender" again ends up under init.

A subreaper minimizes this somewhat. Only things injected by a "rambo admin"
(a term I really like) will be outside of the jail's process subtree.
He/she had better know what he/she is doing when doing that.
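
The doublefork case is easy to try out. Here is a small sketch, again
assuming Linux prctl() rather than the procctl() a jail host would use, and
with a made-up function name. A throwaway "supervisor" starts a process that
doubleforks, and the resulting orphan reports whether its new parent is the
supervisor.

```python
import ctypes
import os
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper():
    """Linux only: have orphaned descendants reparented to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), ctypes.c_ulong(0),
                  ctypes.c_ulong(0), ctypes.c_ulong(0)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def doubleforked_stays_under_supervisor(use_subreaper):
    """True if a doubleforked process ends up parented to its supervisor."""
    result_r, result_w = os.pipe()
    supervisor = os.fork()
    if supervisor == 0:
        try:
            os.close(result_r)
            if use_subreaper:
                become_subreaper()
            supervisor_pid = os.getpid()
            first = os.fork()
            if first == 0:
                try:
                    first_pid = os.getpid()
                    if os.fork() == 0:
                        try:
                            # Wait (up to about 5 s) until the first fork has
                            # exited and the kernel has reparented us.
                            for _ in range(500):
                                if os.getppid() != first_pid:
                                    break
                                time.sleep(0.01)
                            stayed = os.getppid() == supervisor_pid
                            os.write(result_w, b"1" if stayed else b"0")
                        finally:
                            os._exit(0)
                finally:
                    os._exit(0)  # first fork exits: the classic doublefork
            os.waitpid(first, 0)
            if use_subreaper:
                os.waitpid(-1, 0)  # the orphan is ours now, so reap it
        finally:
            os._exit(0)
    os.close(result_w)
    with os.fdopen(result_r, "rb") as pipe:
        answer = pipe.read()
    os.waitpid(supervisor, 0)
    return answer == b"1"
```

With use_subreaper=True the orphan should stay in the supervisor's subtree;
with False it is handed to init or to whatever reaper sits further up, which
is the "scatter" described above.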

> One can improve this subtree with broken branches all over the forest
> floor with a local reaper.
> The benefit of local reapers is not a programmatic one for the likes of
> you and me. It is a usability one for administrators and end users
> trying to follow their process trees.

This sums it up nicely.
Received on Mon Dec 12 2016 - 16:00:19 UTC
