OpenRC now supports daemon supervision using s6

From: Guillermo <gdiazhartusch_at_gmail.com>
Date: Sat, 25 Jul 2015 16:43:28 -0300

Hello,

I was recently looking at OpenRC's source tarball to see how it
implemented something, and was surprised by this little discovery:

https://gitweb.gentoo.org/proj/openrc.git/plain/s6-guide.md?h=openrc-0.16.x

Yup. Since version 0.16, OpenRC can launch supervised daemons using
s6, as an alternative to (its implementation of) start-stop-daemon(8).
To use this feature, both an OpenRC init script and an s6 service
directory have to be provided. OpenRC init scripts aren't exactly
shell scripts; they are interpreted by openrc-run(8), and their format
is specified in that command's man page. The structure of s6 service
directories is documented here:

http://www.skarnet.org/software/s6/servicedir.html

The first document above details what to do in the init script. Most
notably, turn the feature on by including a "supervisor=s6"
assignment, and specify a "need" dependency on the (OpenRC-provided)
"s6-svscan" service, to make sure the s6-svscan program is started if
it's not running.

Init scripts must be placed wherever OpenRC was configured at build
time to find them, normally /etc/init.d. A service directory can be
placed anywhere, as long as it is on a writable filesystem and OpenRC
is told its full path by assigning it to the s6_service_path variable
in the init script. The filesystem has to be writable because OpenRC
creates a symlink to the provided servicedir in the scan directory
/run/openrc/s6-scan (yeah, without the "v"), so s6-supervise ends up
creating its supervise/ subdirectory inside the servicedir itself. The
s6_service_path assignment can be omitted if the servicedir has the
same name as the init script and is placed under /var/svc.d.
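
To make the lookup rule above concrete, here is a minimal sketch in
plain sh. The function names resolve_servicedir and scan_link are made
up for illustration and are not part of OpenRC; the sketch only mirrors
the rule as described, not OpenRC's actual code.

```shell
#!/bin/sh
# Hypothetical helpers, NOT part of OpenRC: they only mirror the rule
# described in the text.

# $1 = init script name, $2 = value of s6_service_path (may be omitted)
resolve_servicedir() {
    name=$1
    # Fall back to /var/svc.d/<name> when no explicit path is given
    path=${2:-/var/svc.d/$name}
    printf '%s\n' "$path"
}

# Location of the symlink in OpenRC's scan directory for a given service
scan_link() {
    printf '/run/openrc/s6-scan/%s\n' "$1"
}

resolve_servicedir lazy-daemon
# prints /var/svc.d/lazy-daemon
resolve_servicedir lazy-daemon /home/test/openrc/lazy-daemon
# prints /home/test/openrc/lazy-daemon
scan_link lazy-daemon
# prints /run/openrc/s6-scan/lazy-daemon
```

So an init script named lazy-daemon with no s6_service_path would be
expected to use /var/svc.d/lazy-daemon.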

Options for the s6-svwait command can be specified by assigning them
to the s6_svwait_options_start variable. For example, suppose the
daemon supports s6-compatible readiness notification and the service
directory is written to make use of it. If the init script then
includes an s6_svwait_options_start=-U assignment, rc-service <service
name> start will block until the service has notified that it is
ready. With s6_svwait_options_start="-U -t 2000" it will block until
the notification arrives or 2 seconds (2000 milliseconds) have
elapsed, whichever comes first.
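
As a rough illustration of where those options end up, the sketch
below prints the s6 commands one might run by hand to get the same
effect. The function name s6_start_commands is invented, and the exact
sequence OpenRC uses internally is an assumption on my part; only the
individual commands and flags (s6-svscanctl -a, s6-svc -u, s6-svwait
-U and -t) are real s6 ones. The commands are printed rather than
executed, so this can be tried without s6 installed.

```shell
#!/bin/sh
# Hypothetical helper, NOT OpenRC's implementation: prints an assumed
# manual equivalent of "rc-service <name> start" for an s6 service.
s6_start_commands() {
    servicedir=$1           # real service directory
    name=$2                 # init script name
    svwait_options=${3:-}   # contents of s6_svwait_options_start, may be empty
    link=/run/openrc/s6-scan/$name

    # Make the service visible in the scan directory and rescan it
    echo "ln -sf $servicedir $link"
    echo "s6-svscanctl -a /run/openrc/s6-scan"
    # Ask the supervisor to bring the service up
    echo "s6-svc -u $link"
    # Only wait for readiness if options were supplied
    if [ -n "$svwait_options" ]; then
        echo "s6-svwait $svwait_options $link"
    fi
}

s6_start_commands /home/test/openrc/lazy-daemon lazy-daemon "-U -t 2000"
```

With "-U -t 2000" the last printed line is the s6-svwait call that
would do the blocking; with an empty third argument no s6-svwait line
is printed at all.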

On the downside, s6-svscan itself is started with start-stop-daemon,
which by default redirects stdout and stderr to /dev/null. This means
the supervision tree's logs will be lost. So, currently, to at least
keep the supervised daemon's logs, the daemon must have a dedicated
logger, that is, a log/ subdirectory in its service directory.

Cheers!
G.

====== Made-up example using OpenRC version 0.16.4 and s6 version 2.1.6.0 ======

OK, so we need an OpenRC init script. Check.

$ cat /etc/init.d/lazy-daemon
#!/sbin/openrc-run
name="Lazy daemon"
description="A daemon that does nothing."

# Turn on process supervision
supervisor=s6

# Tell OpenRC the location of the service directory
# Not necessary if it is /var/svc.d/lazy-daemon
s6_service_path=/home/test/openrc/lazy-daemon

# Block until the service is ready
s6_svwait_options_start=-U

depend() {
  need s6-svscan
}

$ sudo rc-service lazy-daemon describe
 * A daemon that does nothing.
 * cgroup_cleanup: Kill all processes in the cgroup

And an s6 service directory. Check.

$ ls -l /home/test/openrc/lazy-daemon
total 12
drwxr-xr-x 2 test test 4096 Jul 23 21:53 log
-rw-r--r-- 1 test test 2 Jul 23 20:08 notification-fd
-rwxr-xr-x 1 test test 357 Jul 24 20:23 run

$ cat /home/test/openrc/lazy-daemon/run
#!/bin/execlineb -P
# Let's not do this as root
s6-setuidgid daemon
fdmove -c 2 1
foreground { echo Service started }

# Take 10 seconds to make the bed
foreground { sleep 10 }

# We're ready, tell s6-supervise. We use FD 5. Because.
foreground {
  fdmove 1 5
  printf "\n"
}
fdclose 5
foreground { echo Service ready }

# Take a nap, long enough for our tests
sleep 120

$ cat /home/test/openrc/lazy-daemon/notification-fd
5
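
For those who prefer plain sh over execline, the readiness step of the
run script could be sketched like this. It assumes, as above, that
notification-fd contains 5, so that s6-supervise hands the service a
pipe on FD 5. The function name notify_ready is made up.

```shell
#!/bin/sh
# Sketch of the readiness step only. Assumes notification-fd contains 5.
notify_ready() {
    # A single newline written to the notification FD signals readiness
    printf '\n' >&5
}

# In a real run script one would call notify_ready once the daemon is
# ready, then close the descriptor with: exec 5>&-

# Stand-alone demonstration: a temporary file plays the role of the
# pipe that s6-supervise would provide on FD 5.
tmp=$(mktemp)
( notify_ready ) 5>"$tmp"
wc -c < "$tmp"    # byte count of what was written: just the newline
rm -f "$tmp"
```

This only covers the notification itself; dropping privileges and
redirecting stderr would still have to be done some other way in an sh
script.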

$ ls -l /home/test/openrc/lazy-daemon/log
total 4
-rwxr-xr-x 1 test test 160 Jul 24 20:24 run

$ cat /home/test/openrc/lazy-daemon/log/run
#!/bin/execlineb -P
# Let's not do this as root
s6-setuidgid daemon
umask 044
s6-log n1 s10240 t /home/test/openrc/lazy-daemon-logs

So let's start this thing:

$ time sudo rc-service lazy-daemon start
 * Creating s6 scan directory
 * /run/openrc/s6-scan: creating directory
 * Starting s6-svscan ... [ ok ]
 * Starting Lazy daemon ... [ ok ]

real 0m13.747s
user 0m0.471s
sys 0m1.009s

OK, that took more than 10 seconds. Looks like OpenRC really waited
for the daemon to be ready. And now there are two new services:

$ rc-status -a
Runlevel: sysinit
[...]
Dynamic Runlevel: needed
[...]
 s6-svscan [ started ]
[...]
Dynamic Runlevel: manual
 lazy-daemon [ started ]

And a supervision tree:

$ pstree -Apa
init,1
[...]
  |-s6-svscan,1709 /run/openrc/s6-scan
  |   |-s6-supervise,1728 lazy-daemon/log
  |   |   `-s6-log,1731 n1 s10240 t /home/test/openrc/lazy-daemon-logs
  |   `-s6-supervise,1729 lazy-daemon
  |       `-sleep,1732 120
[...]

And a scan directory:

$ ls -l /run/openrc/s6-scan
total 0
lrwxrwxrwx 1 root root 47 Jul 25 12:24 lazy-daemon -> /home/test/openrc/lazy-daemon

s6 says the service is up now:

$ sudo s6-svstat /home/test/openrc/lazy-daemon
up (pid 1732) 23 seconds

And OpenRC also knows, by asking the supervisor:

$ sudo rc-service lazy-daemon status
up (pid 1732) 24 seconds

Alas, no supervision tree logs:

$ sudo ls -l /proc/1709/fd
total 0
lrwx------ 1 root root 64 Jul 25 12:24 0 -> /dev/null
lrwx------ 1 root root 64 Jul 25 12:24 1 -> /dev/null
lrwx------ 1 root root 64 Jul 25 12:24 2 -> /dev/null
[...]

But hey, at least we have the daemon's logs:

$ ls -l /home/test/openrc/lazy-daemon-logs
total 4
-rw-r--r-- 1 daemon daemon 164 Jul 25 12:26 current
-rw--w--w- 1 daemon daemon 0 Jul 25 12:24 lock
-rw--w--w- 1 daemon daemon 0 Jul 25 12:24 state

$ cat /home/test/openrc/lazy-daemon-logs/current | s6-tai64nlocal
2015-07-25 12:26:21.289361885 Service started
2015-07-25 12:26:31.549368351 Service ready

OK, enough. Let's end this:

$ sudo rc-service lazy-daemon stop
 * Stopping Lazy daemon ... [ ok ]

$ sudo s6-svstat /home/test/openrc/lazy-daemon
down (signal SIGTERM) 19 seconds, normally up
Received on Sat Jul 25 2015 - 19:43:28 UTC
