Re: How to recover from s6-rc broken pipe?

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Wed, 16 Dec 2020 20:11:12 +0000

>I'm using s6-rc to manage services and have been changing databases.
>
>For some unknown reason sometimes the update fails with the error:
>
>s6-rc-update: fatal: unable to read /run/s6-rc/state: Broken pipe

  That should definitely not happen.
  Have your databases been built with the same version of s6-rc as
the one you're using? (Normally they're compatible, but there was
an incompatible change between 0.3 and 0.4.)


>When that happens, I cannot use s6-rc anymore:
>
>/run/s6-rc does not exist, but s6-rc declares it as if it does:

  Does s6-rc-update remove the /run/s6-rc symlink when it fails? If it
does, it's a bug that I will fix for the next release.


>s6-rc-init: fatal: unable to supervise service directories in
>/run/s6-rc/servicedirs: File exists
>
>Creating the s6-rc symlink does not improve the situation.
>
>How should I recover from this error?

  It is possible that you have a bunch of dangling symlinks in
/run/service that pointed to your old live directory. They are not
valid anymore, but they still prevent s6-rc from doing its job.
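
  To check whether this is the case, something like the following
could list the dangling symlinks in the scandir. This is only a minimal
sketch: the function name is made up for illustration, and it assumes
the scandir is /run/service as in the messages above. It only reads
the directory and removes nothing.

```python
import os

def dangling_symlinks(scandir="/run/service"):
    """Return the sorted names of symlinks in scandir whose target is gone.

    The function name and the default path are assumptions for this
    sketch; pass your actual scandir if it differs.
    """
    dangling = []
    for entry in os.scandir(scandir):
        # is_symlink() looks at the entry itself, while os.path.exists()
        # follows the link, so it returns False when the target is missing.
        if entry.is_symlink() and not os.path.exists(entry.path):
            dangling.append(entry.name)
    return sorted(dangling)
```

  Calling dangling_symlinks() with no argument inspects /run/service;
an empty list means no symlink there points to a missing target.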

  Generally speaking, s6-rc-update failing is bad news, because it is
difficult to do the proper cleanups, either automatically when
s6-rc-update fails (some operations cannot be rolled back) or manually
afterwards. So yeah, your scandir may be in an ugly state, and you may
need to remove all the symlinks there, delete all the /run/s6-rc*
directories and/or symlinks, and restart from scratch. Depending on
the changes your oneshots made to your system, you may get errors when
running them again, too, so in the worst case, the only good option
might be to reboot. Sorry.
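
  The manual cleanup described above could be sketched roughly as
follows. The function names are hypothetical, and the default paths
(/run/service for the scandir, /run for the directory holding the
s6-rc* entries) are assumptions taken from the messages in this
thread. The sketch separates planning from deletion so that the list
can be reviewed before anything is removed. It does not stop running
services or re-run s6-rc-init.

```python
import glob
import os
import shutil

def plan_cleanup(scandir="/run/service", run_dir="/run"):
    """List what the manual cleanup would remove, without touching anything.

    Hypothetical helper: collects every symlink in the scandir plus
    every entry named s6-rc* directly under run_dir.
    """
    paths = []
    if os.path.isdir(scandir):
        for entry in os.scandir(scandir):
            # Only symlinks are collected; real directories in the
            # scandir are left alone.
            if entry.is_symlink():
                paths.append(entry.path)
    paths.extend(glob.glob(os.path.join(run_dir, "s6-rc*")))
    return sorted(paths)

def apply_cleanup(paths):
    """Remove the given paths: unlink symlinks and files, rmtree directories."""
    for path in paths:
        if os.path.islink(path) or not os.path.isdir(path):
            # Symlinks are unlinked, never followed, so their targets
            # are only removed if they are listed themselves.
            os.unlink(path)
        else:
            shutil.rmtree(path)
```

  Printing the result of plan_cleanup() first and only then passing
it to apply_cleanup() keeps the destructive step explicit.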

  But really, the original problem should not happen, and
s6-rc-update should not be failing like this. If it does not happen
all the time, is something overwriting your state file, or removing
the /run/s6-rc symlink? If your databases are compatible, then there
is definitely some external interference here.
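
  One way to look for such interference is to take a snapshot of the
live symlink and its state file before and after whatever else runs
on the machine, and compare the two. The sketch below is only an
illustration: the function name and the fields it returns are made
up, and the paths /run/s6-rc and /run/s6-rc/state are taken from the
error message quoted above.

```python
import os

def live_status(live="/run/s6-rc"):
    """Snapshot the live symlink and its state file for later comparison.

    Hypothetical helper; the returned keys are chosen for this sketch.
    """
    is_link = os.path.islink(live)
    status = {
        "is_symlink": is_link,
        # Where the symlink points, or None if live is not a symlink.
        "target": os.readlink(live) if is_link else None,
        # isdir() follows the link, so this is False for a dangling one.
        "target_exists": os.path.isdir(live),
        "state_size": None,
        "state_mtime": None,
    }
    try:
        st = os.stat(os.path.join(live, "state"))
    except OSError:
        # State file missing or unreachable: leave size and mtime as None.
        return status
    status["state_size"] = st.st_size
    status["state_mtime"] = st.st_mtime
    return status
```

  If two snapshots differ when s6-rc-update has not been run in
between, something else is touching the symlink or the state file.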

--
  Laurent
Received on Wed Dec 16 2020 - 20:11:12 UTC