Re: s6-supervise: use of nosetsid from Steve Litt on 2020-12-03 (supervision)

From: Steve Litt <slitt_at_troubleshooters.com>
Date: Thu, 3 Dec 2020 14:53:07 -0500

On Thu, 03 Dec 2020 16:46:58 +0000
"Laurent Bercot" <ska-supervision_at_skarnet.org> wrote:

> Hello,
>
> The next version of s6 will be a major bump, with a few long-awaited
> QoL changes - mainly a thorough cleanup of how s6-svscan handles
> signals and the various commands sent by s6-svscanctl, but also some
> goodies that you should like. :)
>
> One issue that has been often reported by users is that when they
> try running s6-svscan in a terminal, and then ^C to kill it, the
> services remain running. This is intentional, because supervision
> suites are designed to isolate processes from anything accidental that
> could bring them down, and in particular services should normally
> survive supervisor death - but so far there have been many more
> instances of people having trouble with that behaviour than instances
> of s6-supervise accidentally dying.
>
> I have previously added the "nosetsid" feature to s6-supervise, to
> address the issue: having a "nosetsid" file in a service directory
> prevents the service from being started as a session leader: it starts
> in the same session as the supervision tree (and, if the nosetsid file
> is empty, in the same process group). So when people want to manually
> test a supervision tree, they can have nosetsid files in their test
> service directories, and ^C will send a SIGINT to all the processes
> including the services, so everything will die, which is what they
> want.
>
> There are two problems with the nosetsid approach:
>
> - Oftentimes, users are not aware of the existence of nosetsid, and
> still experience the issue. It's almost an s6 FAQ at this point.
> - The nosetsid functionality is inherently a risk: it puts the
> whole supervision tree at the mercy of a misbehaved service that would
> send a signal to its whole process group. There is a reason why
> s6-supervise normally starts services in a different session, and
> nosetsid bypasses that safety measure.
>
> So I am thinking of another approach to make s6 friendlier to users
> who would - despite it not being recommended behaviour - test a
> supervision tree in a terminal: have s6-supervise handle SIGINT and
> make it kill its service before exiting. That would ensure that ^C
> cleans up everything.
>
> This approach has the drawback of making services a little less
> resilient, but s6-supervise getting a SIGINT should *only* happen in
> the case of someone running a supervision tree in a terminal, which
> is absolutely not something that should exist in production, so it's
> probably not a big concern. However, it comes with a major advantage:
> it removes the original reason for the addition of nosetsid.
> So, with the change to ^C handling, I am considering removing the
> dangerous nosetsid functionality entirely.
>
> Hence, my question to users: do you have a *valid* reason to use
> nosetsid files in your service directories? Are there use cases for
> nosetsid that I have not thought about, and that would make using s6
> impractical if the functionality were to be removed?
>
> Thanks in advance for your input.

I have no reason for nosetsid files.

I'm a big fan of keeping things the same and keeping them simple. What
happens, in the current s6, if somebody first Ctrl+C's s6-svscan from
the terminal, and then issues a pkill s6-supervise, with whatever
signal will kill an s6-supervise instance?
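
For reference, the mechanism behind that question can be sketched in a
few lines of Python. This is only an illustration of the session
behaviour described in the quoted text, not s6 code: spawn() and the
sleeping child are hypothetical stand-ins for s6-supervise starting a
service. A child started in its own session is outside the terminal's
foreground process group, so ^C does not reach it; a child started
without a new session stays in ours, which is roughly what an empty
nosetsid file gives you.

```python
import os
import subprocess
import sys

def spawn(new_session):
    # Hypothetical stand-in for a service: a child that just sleeps.
    # start_new_session=True makes the child call setsid() before exec.
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=new_session,
    )

def same_session_as_us(child):
    # On Linux, getsid() may be called on a process in another session.
    return os.getsid(child.pid) == os.getsid(0)

def same_pgrp_as_us(child):
    # Terminal-generated signals such as SIGINT go to the foreground
    # process group, so this decides whether ^C reaches the child.
    return os.getpgid(child.pid) == os.getpgrp()
```

Calling same_pgrp_as_us() on a child from spawn(True) should return
False, which is why such a service keeps running after ^C.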

If the actual daemons survive the death of their individual s6-supervise
supervisors after the pkill, then yes, you could modify s6-supervise to
kill the daemon it's supervising. You could even make it an option by
having a certain filename turn that behavior off, if people want that.
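
A rough sketch of that idea, again in Python rather than s6's C, and
under loud assumptions: the opt-out file name "keep-on-sigint" is made
up for illustration (s6 defines no such file), and start_service(),
stop_on_sigint() and supervise() are hypothetical helpers, not anything
from s6-supervise. It relies on Python's default SIGINT behaviour of
raising KeyboardInterrupt.

```python
import os
import signal
import subprocess
import sys

# Hypothetical opt-out file name; s6 defines no such file.
OPT_OUT = "keep-on-sigint"

def start_service(argv):
    # Start the service in its own session, as the quoted text says
    # s6-supervise normally does.
    return subprocess.Popen(argv, start_new_session=True)

def stop_on_sigint(child, servicedir):
    """Kill the service unless the opt-out file exists.

    Returns True if the service was killed, False if it was left alone.
    """
    if os.path.exists(os.path.join(servicedir, OPT_OUT)):
        return False
    child.terminate()  # sends SIGTERM to the service
    child.wait()       # reap it before the supervisor exits
    return True

def supervise(argv, servicedir):
    child = start_service(argv)
    try:
        return child.wait()
    except KeyboardInterrupt:
        # Python turns SIGINT into KeyboardInterrupt by default; clean
        # up the service here, then let the supervisor exit.
        stop_on_sigint(child, servicedir)
        return 0
```

With the file absent, ^C on the supervisor takes the service down with
it; with the file present, the service is left running, which matches
the current behaviour.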

SteveT

Steve Litt
Autumn 2020 featured book: Thriving in Tough Times
http://www.troubleshooters.com/thrive
Received on Thu Dec 03 2020 - 19:53:07 UTC
