>I have a problem process that's getting stuck somewhere in its network
>code (clearly the right answer is to find/fix the problem, but it's
>infrequent and we're up against time constraints...)
>
>I'm looking at just adding a wrapper around it which does something
>along the lines of the existing s6 notification - have the supervised
>process write a regular character to an inherited fd, if it ever stops
>then the wrapper kills everyone and exits allowing the supervision to
>restart it.
What you want is process monitoring. s6 doesn't do that, because
the monitoring needs are very process-specific and it's impossible to
predict all the functionality that every daemon under the sun could
want to use.
Here, you need a heartbeat. You can implement the heartbeat monitor
as a separate service, which it is: it's a service that reads the
heartbeat from your daemon and sends it an s6-svc -r command when
it fails to receive the heartbeat for a period of time.
If you don't need anything more than that, you can probably write
the heartbeat monitor in shell (or even in execline!). It's a call
to mkfifo, then a read loop. :)
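
A rough sketch of what such a monitor could look like in plain sh,
not a tested implementation. Assumptions: the function names, the
fifo path and the service directory are all made up for the example;
the daemon writes at least one byte to the fifo per interval; a
timeout(1) utility taking a duration in seconds is available (GNU
coreutils, or a recent BusyBox); and the fifo is opened read-write,
which works on Linux but is not guaranteed by POSIX.

```shell
#!/bin/sh
# Hypothetical heartbeat monitor sketch; every name and path is illustrative.

# wait_for_silence TIMEOUT
# Consume heartbeat bytes from fd 3. Returns once no byte has arrived
# for TIMEOUT seconds (timeout(1) kills dd and exits nonzero).
wait_for_silence() {
  while timeout "$1" dd bs=1 count=1 <&3 >/dev/null 2>&1; do
    :   # got a byte in time: the daemon is alive, keep waiting
  done
}

# monitor FIFO TIMEOUT SERVICEDIR
# Create the fifo if needed, then restart SERVICEDIR each time the
# heartbeat stops for TIMEOUT seconds.
monitor() {
  fifo=$1 timeout=$2 servicedir=$3
  [ -p "$fifo" ] || mkfifo "$fifo" || return 1
  # Read-write open: does not block waiting for a writer, and never
  # sees EOF while the daemon is being restarted (Linux behaviour).
  exec 3<>"$fifo"
  while :; do
    wait_for_silence "$timeout"
    s6-svc -r "$servicedir"
  done
}
```

The monitor's run script would then end with something like
"monitor /run/mydaemon/heartbeat 30 /run/service/mydaemon" (both paths
hypothetical). Pick a timeout longer than the daemon's startup time,
otherwise the monitor restarts it again before the first heartbeat.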
--
Laurent
Received on Sat Nov 20 2021 - 22:57:32 CET