Re: s6-rc transition failures

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Sun, 18 Jun 2017 16:00:29 +0000

>Yes that would help. I suppose you also mean to wait for the
>service to go down before returning ?

  No, that would open the door to more issues if the finish script is
bad or something like that. Generally speaking you can't do much if a
destructor fails, and there's no possible good error handling in those
cases - destructor failures should be ignored almost all the time.
Not following that rule of thumb leads to, for instance, systemd
refusing to reboot if something bad happens when taking down a service,
which makes sysadmins all over the world laugh, or cry, or both.

  s6-rc now fires and forgets an s6-svc -d in case of a longrun transition
failure. (I didn't make it optional because doing it is always better
than not doing it.) It doesn't wait for down readiness. If the service
has trouble exiting, that's what timeout-kill is for. If the finish
script is bad, that's what timeout-finish is for. If the admin has
disabled those timeouts, tough - there's only so much foolproofing
you can do.
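
  To make the difference between "fire and forget" and "wait for the
service to go down" concrete, here is a minimal Python sketch. It is not
s6-rc's code (s6-rc is written in C); the helper names and the
/run/service/foo path are made up for illustration. It assumes only
s6-svc's -d option (bring the service down) and its -wD option (block
until the service is down and its finish script has exited).

```python
import subprocess

def down_argv(servicedir, wait=False):
    """Build an s6-svc command line that brings a service down.

    wait=False: only send the 'down' command (fire and forget).
    wait=True:  also block until the finish script has exited (-wD).
    Hypothetical helper, for illustration only.
    """
    argv = ["s6-svc"]
    if wait:
        argv.append("-wD")
    argv += ["-d", servicedir]
    return argv

def fire_and_forget_down(servicedir, runner=subprocess.Popen):
    """Spawn 's6-svc -d' and return at once, ignoring the outcome.

    'runner' is injectable so the sketch can be exercised without s6.
    A long-lived caller would still want to reap the child eventually.
    """
    try:
        runner(
            down_argv(servicedir),
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Destructor-style failure: nothing useful can be done, so ignore it.
        pass

# Example (the path is an assumption):
# fire_and_forget_down("/run/service/foo")
```

  The wait=True variant is the behaviour that was asked about and
deliberately not chosen: it would block on a finish script that may
itself be broken.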


>I was thinking exactly the same :). I even think this could be tailored
>to system shutdown (I do not see another use case). E.g. for ongoing
>longrun up transitions, s6-rc could act as if the transition timed out
>and send "s6-svc -d". For ongoing longrun down transitions I am not
>sure whether it should wait for it to complete or not.

  s6-rc now just kills every longrun transition when it receives a
SIGTERM or SIGINT. This allows it to exit early, no matter what
transition it was stuck in. It does nothing special when a down
transition aborts; but it sends an s6-svc -d when an up transition
aborts. So in all cases, you have a good probability that the
problematic services will be down.
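
  The abort logic described above can be sketched roughly as follows, in
Python. This is an illustration under stated assumptions, not the actual
s6-rc implementation: the Transition record, the function names and the
on_abort callback are all hypothetical, and "proc" stands for whatever
process is carrying out the transition.

```python
import signal
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Transition:
    service: str    # service name (hypothetical bookkeeping)
    going_up: bool  # True for an up transition, False for a down one
    proc: Any       # anything with a kill() method, e.g. subprocess.Popen

def abort_transitions(pending: List[Transition],
                      send_down: Callable[[str], None]) -> List[str]:
    """Kill every pending longrun transition.

    Returns the list of services that were asked to go down.
    """
    downed = []
    for t in pending:
        t.proc.kill()             # stop being stuck in this transition
        if t.going_up:
            send_down(t.service)  # aborted up transition: request 'down'
            downed.append(t.service)
        # aborted down transition: nothing special is done
    return downed

def install_handlers(pending, send_down, on_abort):
    """Run the abort logic on SIGTERM or SIGINT, then notify the caller."""
    def handler(signum, frame):
        abort_transitions(pending, send_down)
        on_abort(signum)          # e.g. tell the main loop to exit early
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, handler)
```

  Here send_down would be something that fires an s6-svc -d for the named
service without waiting for the result.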

  I'm still unsure if s6-rc should also kill its oneshot subprocesses.
There are arguments for and against it. I haven't done it for now, but
am open to changing that in the future.

  The changes are available in the latest s6-rc git; they're untested.
Please test them and tell me if they work for you.

--
  Laurent