Service startup notifications

It is easy for a process supervision suite to know when a service that was up is now down: the long-lived process implementing the service is dead. The supervisor, running as the daemon's parent, is instantly notified via a SIGCHLD. When that happens, s6-supervise sends a 'd' event to its ./event fifodir, so every subscriber knows that the service is down. All is well.

It is much trickier for a process supervision suite to know when a service that was down is now up. The supervisor forks and execs the daemon, and knows when the exec has succeeded; but after that point, it's all up to the daemon itself. Some daemons do a lot of initialization work before they're actually ready to serve, and it is impossible for the supervisor to know exactly when the service is really ready. s6-supervise sends a 'u' event to its ./event fifodir when it successfully spawns the daemon, but any subscriber reacting to 'u' is subject to a race condition - the service provided by the daemon may not be ready yet.

Reliable startup notifications need support from the daemons themselves. Daemons should do two things to signal the outside world that they are ready:

Update a state file, so other processes can get a snapshot of the daemon's state.
Send an event to processes waiting for a state change.

This is complex to implement in every single daemon, so s6 provides tools to make it easier for daemon authors, without any need to link against the s6 library or use any s6-specific construct: daemons can simply write a line to a file descriptor of their choice, then close that file descriptor, when they're ready to serve. This is a generic mechanism that some daemons already implement.

s6 supports that mechanism natively: when the service directory for the daemon contains a valid notification-fd file, the daemon's supervisor, i.e. the s6-supervise program, will properly catch the daemon's message, update the status file (supervise/status), then notify all the subscribers with a 'U' event, meaning that the service is now up and ready.

This method should really be implemented in every long-running program providing a service. When it is not, it is impossible to provide reliable startup notifications, and subscribers must settle for the unreliable 'u' events provided by s6-supervise.

Unfortunately, a lot of long-running programs do not offer that functionality; instead, they provide a way to poll them: an external program that runs and checks whether the service is ready. This is a bad mechanism, for several reasons; among them, it spends resources on repeated checks, and it detects readiness only at the next check rather than at the moment the service becomes ready. Nevertheless, until all daemons are patched to notify their own readiness, s6 provides a way to run such a check program to poll for readiness, and route its result into the s6 notification system: s6-notifyoncheck.

How to use a check program with s6 (i.e. readiness checking via polling)

Let's say you have a daemon foo, started under s6 via a /run/service/foo service directory, and that comes with a foo-check program that exhibits different behaviours when foo is ready and when it is not.
Create an executable script /run/service/foo/data/check that calls foo-check. Make sure this script exits 0 when foo is ready and nonzero when it's not.
In your /run/service/foo/run script that starts foo, instead of executing into foo, execute into s6-notifyoncheck foo. Read the s6-notifyoncheck page if you need to give it options to tune the polling.
Run echo 3 > /run/service/foo/notification-fd to create the notification-fd file. If file descriptor 3 is already open when your run script executes foo, replace 3 with a file descriptor you know is not already open.
That's it.
- Your check script will be automatically invoked by s6-notifyoncheck, until it succeeds.
- s6-notifyoncheck will send the readiness notification to the file descriptor given in the notification-fd file.
- s6-supervise will receive it and will mark foo as ready.
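
The steps above can be sketched as a short shell script. This is only an illustration under stated assumptions: the service directory is created in a scratch location (standing in for /run/service/foo) so the script can be tried without touching a live scan directory, foo is assumed to run in the foreground without arguments, and foo-check is assumed to exit 0 exactly when foo is ready.

```sh
#!/bin/sh
# Sketch only: build the service directory in a scratch location.
# On a real system this would be /run/service/foo.
svdir="$(mktemp -d)/foo"
mkdir -p "$svdir/data"

# data/check: must exit 0 when foo is ready, nonzero otherwise.
# Assumption: foo-check already follows that exit code convention.
cat > "$svdir/data/check" <<'EOF'
#!/bin/sh
exec foo-check
EOF
chmod +x "$svdir/data/check"

# run: execute into s6-notifyoncheck, which in turn runs foo.
cat > "$svdir/run" <<'EOF'
#!/bin/sh
exec s6-notifyoncheck foo
EOF
chmod +x "$svdir/run"

# notification-fd: the descriptor used for the readiness notification.
echo 3 > "$svdir/notification-fd"

echo "service directory created in $svdir"
```

If foo-check reports readiness in some other way (for instance by printing a status string), the data/check script is the place to translate that into an exit code.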

How to design a daemon so it uses the s6 mechanism without resorting to polling (i.e. readiness notification)

The s6-notifyoncheck mechanism was made to accommodate daemons that provide a check program but do not notify readiness themselves; it works, but is suboptimal. If you are writing the foo daemon, here is how you can make things better:

Readiness notification should be optional, so you should guard all the following with a run-time option to foo.
Assume a file descriptor other than 0, 1 or 2 is going to be open. You can hardcode 3 (or 4); or you can make it configurable via a command line option. See for instance the -D notif option to the mdevd program. It really doesn't matter what this number is; the important thing is that your daemon knows that this fd is already open, and is not using it for another purpose.
Do nothing with this file descriptor until your daemon is ready.
When your daemon is ready, write a newline to this file descriptor.
- If you like, you may write other data before the newline, just in case it is printed to the terminal. It is not necessary, and it is best to keep that data short. If the line is read by s6-supervise, it will be entirely ignored; only the newline is important.
Then close that file descriptor.
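
As a rough illustration of the steps above, here is a minimal sketch in shell. The foo function is a hypothetical stand-in for the daemon, the "notify" argument is an invented run-time switch, and the descriptor number 3 is hardcoded; a real daemon would perform the same two steps (write a newline, then close) in its own language.

```sh
#!/bin/sh
# Hypothetical stand-in for the foo daemon, written as a shell function.
# $1 = "notify" enables readiness notification; anything else disables it.
foo() {
    # ... initialization work would happen here; fd 3 is left untouched ...

    if [ "${1:-}" = notify ]; then
        echo >&3     # the daemon is ready: write a newline to fd 3
        exec 3>&-    # then close fd 3
    fi

    # ... the daemon would now start serving ...
}

# Demonstration: run foo in a subshell with fd 3 opened on a scratch file,
# which plays the role of the descriptor that the supervisor would provide.
out="$(mktemp)"
( foo notify ) 3>"$out"
echo "bytes written to the notification fd: $(wc -c < "$out")"
```

Keeping the notification behind a switch means that foo still works when it is started by hand with no descriptor 3 open.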

The user who then makes foo run under s6 just has to do the following:

Write 3, or the file descriptor the foo daemon uses to notify readiness, to the /run/service/foo/notification-fd file.
In the /run/service/foo/run script, invoke foo with the option that activates the readiness notification. If foo makes the notification fd configurable, the user needs to make sure that the number that is given to this option is the same as the number that is written in the notification-fd file.
And that is all. Do not use s6-notifyoncheck in this case, because you do not need to poll to know whether foo is ready; instead, foo will directly communicate its readiness to s6-supervise, and that is a much more efficient mechanism.
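
For a concrete case, the following sketch uses mdevd and its -D notif option mentioned earlier, since the option name for foo is not specified. It is an assumption-laden illustration: the service directory is built in a scratch location rather than under /run/service, and a single shell variable holds the descriptor number so that the run script and the notification-fd file cannot disagree.

```sh
#!/bin/sh
# Sketch only: scratch directory standing in for /run/service/mdevd.
svdir="$(mktemp -d)/mdevd"
mkdir -p "$svdir"

# One variable feeds both files, so the two numbers always match.
fd=3

echo "$fd" > "$svdir/notification-fd"

# Unquoted heredoc delimiter: $fd is expanded while the run script is written.
cat > "$svdir/run" <<EOF
#!/bin/sh
exec mdevd -D $fd
EOF
chmod +x "$svdir/run"

cat "$svdir/run"
```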

What does s6-supervise do with this readiness information?

s6-supervise maintains a readiness state for other programs to read. You can check for it, for instance, via the s6-svstat program.
s6-supervise also broadcasts the readiness event to programs that are waiting for it - for instance the s6-svwait program. This can be used to make sure that other programs only start when the daemon is ready. For instance, the s6-rc service manager uses that mechanism to bring sets of services up or down: a service starts as soon as all its dependencies are ready, but never earlier.
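
Both uses can be wrapped in small shell helpers. This is a sketch, not a reference: wait_ready and is_ready are hypothetical names, the 5000 millisecond default timeout is arbitrary, and the helpers assume that s6-svwait accepts -U (up and ready) and -t (timeout in milliseconds) and that s6-svstat -o ready prints true or false; check the s6-svwait and s6-svstat pages for the options available in your version.

```sh
#!/bin/sh
# wait_ready DIR [TIMEOUT_MS]
# Block until the service in DIR is up and ready, or until the timeout.
wait_ready() {
    s6-svwait -U -t "${2:-5000}" "$1"
}

# is_ready DIR
# Succeed if the readiness state maintained by s6-supervise says "ready".
is_ready() {
    [ "$(s6-svstat -o ready "$1")" = true ]
}
```

A run script for a service that depends on foo could then begin with wait_ready /run/service/foo || exit 1 before executing into its own daemon, although a service manager such as s6-rc handles that ordering for you.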