On Fri, Aug 14, 2015 at 7:48 AM, Laurent Bercot <ska-skaware_at_skarnet.org> wrote:
> First, the only time this makes a qualitative difference is when
> the pipe maintainer cannot die at all. In one setup, you lose your
> pipe when "s6-svscan" dies; in the other setup, you lose your pipes
> when "s6-fdholderd" dies. The only way to prevent that is to forbid
> your pipe maintainer from dying entirely.
>
> Second, the only way to do that is to put the pipe maintainer as
> process 1; but I don't think putting things in process 1 to make
> them indestructible is the answer. It's the systemd way. "We're
> process 1, so we cannot die, and we can do everything on the system
> that needs reliability."
> Granted, it's a nice thing to have, and I do advocate the use of
> s6-svscan as process 1, but not because it's a pipe maintainer. I
> use s6-svscan as process 1 because it's the natural place for the
> root of a supervision tree; and everything else is a bonus.
>
Regardless of the process 1/not process 1 location of your supervision
tree root, it's more about not introducing a state where your
supervision suite can go totally vegetable without any outside
indication until later. And yeah, the supervision-root-as-pipe-root
solution is the easy way around that issue, but it isn't the only or
necessarily the right way to do it. It does, however, move the burden
of complexity to implementation as opposed to design.
>
> So, let's make sure it's not a problem when the pipe maintainer
> dies. In this case, let's add a watcher for s6-fdholderd.
> Instead of oneshots that store pipes into the s6-fdholderd, how
> about filling up s6-fdholderd at start time with all the pipes
> it needs ? The processes in a pipeline will keep using the old
> pipes until one of them dies, at which point the old pipe will
> close, propagating the EOF or EPIPE to the other processes in
> the pipeline; eventually all the processes in the pipeline will
> restart, and fetch the new set of pipes from s6-fdholderd.
>
Short of persisting some state so that a new fdholderd can reconnect
to the old pipe set with the correct names (not sure if that's even
possible), this is probably the best solution. I assume that the
watcher will take the form of extra code in the run script to populate
the fdholderd once it's running and not a separate service.
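
For anyone following along, the pipe behaviour the quoted plan relies on
can be shown with a minimal Python sketch. It is only a model, not s6
code: a single process stands in for all the parties, the "holder" is
simulated with dup()ed descriptors, and the function names are made up
for illustration. It shows that a reader sees no EOF while some other
process still holds the write end, and that EOF or EPIPE shows up as soon
as nobody else holds the other end.

```python
import os


def eof_depends_on_holder():
    """Return (eof_while_holder_alive, eof_after_holder_gone)."""
    r, w = os.pipe()
    # Hypothetical stand-in for the fd holder: it keeps its own copies
    # of both ends of the pipe.
    holder_r, holder_w = os.dup(r), os.dup(w)
    os.set_blocking(r, False)  # so an empty pipe does not hang the demo

    os.close(w)  # the writer service "dies"
    try:
        # An empty read would mean EOF; with the holder's write end
        # still open, the kernel reports "no data yet" instead.
        eof_while_held = os.read(r, 1024) == b""
    except BlockingIOError:
        eof_while_held = False

    # The holder "dies": its copies go away, no write end is left.
    os.close(holder_w)
    os.close(holder_r)
    eof_after = os.read(r, 1024) == b""
    os.close(r)
    return eof_while_held, eof_after


def epipe_when_reader_gone():
    """Return True if writing to a pipe with no reader raises EPIPE."""
    r, w = os.pipe()
    os.close(r)  # the reader service "dies" and nobody else holds the end
    try:
        os.write(w, b"log line\n")
        return False
    except BrokenPipeError:  # Python ignores SIGPIPE and reports EPIPE
        return True
    finally:
        os.close(w)


if __name__ == "__main__":
    print(eof_depends_on_holder())   # (False, True)
    print(epipe_when_reader_gone())  # True
```

So once the old holder is gone, the first death on either side of an old
pipe is visible to the other side, which is what lets the whole pipeline
restart and pick up the fresh pipes.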
>
> That sounds reliable to me, and even cleaner than the current
> approach, where the services can't reliably restart if
> s6-fdholderd has died; and it doesn't need additional
> autogenerated oneshots. (Thanks for the rubber duck debugging!
> That's a huge part of why I like design discussions.)
>
That seems safe enough. Like I said before, it's unlikely that
s6rc-fdholder is going to restart, but accidents happen and the result
is a disaster, especially since everything happens in the domain of
the supervisor, so as far as the rc system is concerned everything is
fine.
>
> So yeah, if s6-fdholderd dies, and one process in a pipeline
> dies, then the whole pipeline will restart. I think it's an
> acceptable price to pay, and it's the best we can do without
> involving process 1.
>
Probably safe. I suggest a big warning label though :)
Cheers!
--
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
-- William Blake
Received on Fri Aug 14 2015 - 16:48:02 UTC