Re: s6: something like runit's ./check script from Laurent Bercot on 2015-09-03 (supervision)

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Thu, 3 Sep 2015 20:23:23 +0200

On 03/09/2015 18:25, Buck Evan wrote:
> An s6-checkhelper wrapper that implements exactly the above would make me
> happy enough.

  Yes, that's feasible. I'll think about it.

> if a ./check exists, the framework does the polling for me.

  The thing is, the command that does the polling is "sv check".
There is currently no equivalent in s6, because the status file,
readable via s6-svstat, is supposed to always have an accurate
view of the state of the service.
  I could implement a "s6-svcheck" command that would just run
the ./check script a few times, without interacting with the
status file or the notification mechanism at all; but that's
just a 2-line script loop around ./check, so I never felt there
was a point in writing an actual binary doing that.
  Is that what you want? If it is, I guess I can do it - it's
not like it takes time to write.
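
  A minimal sketch of such a loop, assuming a POSIX shell. The function
name poll_check and its arguments (attempt count, delay in seconds) are
hypothetical; no such helper ships with s6.

```shell
#!/bin/sh
# poll_check TRIES DELAY CHECK...: run CHECK up to TRIES times, sleeping
# DELAY seconds between attempts. Returns 0 as soon as CHECK succeeds,
# 1 if every attempt fails. It touches neither the status file nor the
# notification mechanism.
poll_check() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    # Only sleep if another attempt is coming.
    [ "$tries" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Example (hypothetical): poll_check 5 1 ./check
```

  The exit code is the only output, so a caller can chain it with && or
use it in an if statement.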

> I think it would be implemented to read the notification-fd file and the
> (new) timeout-start files and do the Right Thing.

  No, a "s6-svcheck" command would mirror "sv check" without interacting
with the notification system. (And there's no such thing as timeout-start:
there's only timeout-finish, and that's for the ./finish script. :))

  On the other hand, a "s6-checkwrapper" command, to be used in the ./run
script, would poll ./check at service startup time, in order to wait until
the service is ready, and inject the result into the readiness notification
system, so dependent programs can use "s6-svwait -uwU" as if the service
had native notification abilities. It would stop polling once readiness has
been reported.
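
  To make the idea concrete, here is a rough shell sketch of what such a
wrapper would do. It is an illustration only, not an existing s6 tool:
notify_when_ready is a hypothetical name, and the sketch assumes that
./notification-fd contains the same descriptor number that is passed in.

```shell
#!/bin/sh
# notify_when_ready FD INTERVAL CHECK...: poll CHECK every INTERVAL seconds
# until it succeeds, then write a single newline to file descriptor FD and
# stop polling.
notify_when_ready() {
  fd=$1
  interval=$2
  shift 2
  until "$@"; do
    sleep "$interval"
  done
  # The newline is the readiness notification that s6-supervise reads.
  printf '\n' >&"$fd"
}

# Hypothetical use inside a ./run script, with "3" in ./notification-fd:
#   notify_when_ready 3 1 ./check &
#   exec mydaemon 3>&-
```

  A real implementation would also need a timeout and some cleanup of the
polling process if the daemon dies before becoming ready; the sketch
leaves both out.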

  Those are two different functionalities. Which one do you want: s6-svcheck,
s6-checkwrapper, or both?

> I'd probably define a default value for notification-fd (3?) but if you
> want to error out when it doesn't exist and check does exist, that's fine
> too.

  ./notification-fd just tells s6-supervise to listen for a readiness
notification (a newline) from the run script. It's unrelated to the presence
of ./check. If ./check exists and ./notification-fd doesn't, it just
means your run script can't use s6-checkwrapper and doesn't provide
readiness notification; you can still poll the service by running ./check.

> If a service has a ./check script, I'll populate a thisservice-heartbeat
> sub-service.

  Don't automate that: some services may provide a ./check for occasional
polling without wanting a heartbeat monitor all day long.

> I'll write '3' to notification-fd if it doesn't exist.

  Unless you're going to use s6's readiness notification with
s6-svwait -uwU or something of the kind, forget about notification-fd.

> The thisservice-heartbeat will run ./check at some interval and send
> notification to notification-fd when it succeeds.
> (Will sending many multiple up-notifications hurt anything?)

  It simply won't work, because ./notification-fd is only the number
of a file descriptor made available by s6-supervise for its ./run
script. You can't access that descriptor outside of thisservice/run.
But again, when you have a watchdog that stays there all the time,
forget s6's notification mechanism: just rely on the watchdog's
output.

> If ./check fails, I want to notify s6 that the service is no longer 'up',
> and put it into a state where it will be restarted.
> I'm not sure how I will do that bit.

  Anything wrong with "s6-svc -t /service/thisservice" when the heartbeat
fails?
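
  A sketch of such a heartbeat, assuming a POSIX shell and s6-svc in the
PATH. The function names, the interval, and the service paths in the
example are hypothetical.

```shell
#!/bin/sh
# heartbeat_once SERVICE_DIR CHECK...: run CHECK once; if it fails, ask
# s6-supervise to send SIGTERM to the service via "s6-svc -t".
heartbeat_once() {
  svcdir=$1
  shift
  "$@" || s6-svc -t "$svcdir"
}

# heartbeat_loop INTERVAL SERVICE_DIR CHECK...: repeat heartbeat_once
# forever, sleeping INTERVAL seconds between runs.
heartbeat_loop() {
  interval=$1
  shift
  while :; do
    heartbeat_once "$@"
    sleep "$interval"
  done
}

# Hypothetical ./run body for a thisservice-heartbeat service:
#   heartbeat_loop 10 /service/thisservice /service/thisservice/check
```

  Once the service dies from the SIGTERM, s6-supervise restarts it as
usual, so the heartbeat needs no access to the notification descriptor.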

-- 
  Laurent

Received on Thu Sep 03 2015 - 18:23:23 UTC
