aboutsummaryrefslogtreecommitdiffstats
s6-rc: why?

s6-rc
Software
skarnet.org

Why s6-rc ?

The limits of supervision suites

Supervision suites such as s6, runit, perp or daemontools define a service as a long-lived process, a.k.a daemon. They provide tools to run the daemon in a reproducible way in a controlled environment and keep it alive if it dies; they also provide daemon management tools to, among others, send signals to the daemon without knowing its PID. They can control individual long-lived processes perfectly well, and s6 also provides tools to manage a whole supervision tree. To any system administrator concerned about reliability, supervision suites are a good thing.

However, a supervision suite is not a service manager.

Relying on a supervision suite to handle all the service management is doable on simple systems, where there aren't many dependencies, and where most of the one-time initialization can take place in stage 1, before any daemons are launched. On some embedded systems, for instance, this is perfectly reasonable.

On other systems, though, it is more problematic. Here are a few issues encountered:

  • With a pure supervision tree, all daemons are launched in parallel; if their dependencies are not met, daemons just die, and are restarted by the supervisors, and so on; so eventually the dependency tree is correctly built and everything is brought up. This is okay with lightweight daemons that do not take up too many resources when starting; but with heavyweight daemons, bringing them up at the wrong time can expend CPU or cause heavy disk access, and increase the total booting time significantly.
  • The runit model of separating one-time initialization (stage 1) and daemon management (stage 2) does not always work: some one-time initialization may depend on a daemon being up. Example: udevd on Linux. Such daemons then need to be run in stage 1, unsupervised - which defeats the purpose of having a supervision suite.
  • More generally, supervision suites do not perform dependency management. Their job is to maintain daemons alive and ease their administration; dependency across those daemons is not their concern, and one-time initialization scripts are entirely foreign to them. So a situation like udevd where it is necessary to interleave daemons and one-time scripts is not handled properly by them.

To manage complex systems, pure supervision suites are insufficient, and real service managers, starting and stopping services in the proper order and handling both oneshots (one-time initialization scripts) and longruns (daemons), are needed.

Previous alternatives

Unix distributions usually come with their own init systems and service managers; all of those have flaws one way or another. No widely spread init system gets things right, which is the main reason for the recent "init wars" - no matter what init system you talk about, there are strong, valid reasons to like it and support it, and there are also strong, valid reasons to dislike it.

Non-supervision init systems usually fall in one of two categories, both with pros and cons.

Traditional, sequential starters

Those are either the historical Unix init systems, or newer systems that still favor simplicity. Among them, for instance:

  • sysvinit, the historical GNU/Linux init system, and its companion set of /etc/rc.d init scripts that some distributions like to call sysv-rc. Note that sysvinit does have supervision capabilities, but nobody ever bothered to use them for anything else than gettys, and all the machine initialization, including longruns, is done by the sysv-rc scripts, in a less than elegant way.
  • BSD init, which is very similar to sysvinit - including the supervision abilities that are only ever used for gettys. The /etc/rc script takes care of all the initialization.
  • OpenRC, an alternative, dependency-based rc system.

All these systems run sequentially: they will start services, either oneshots or longruns, one by one, even when the dependency graph says that some services could be started in parallel. Also, the daemons they start are always unsupervised, even when the underlying init system provides supervision features. There usually is no readiness notification support on daemons either, daemons are fire-and-forget (but that's more on the scripts themselves than on the frameworks). Another common criticism of those systems is that the amount of shell scripting is so huge that it has a significant performance impact.

(Note that OpenRC has an option to start services in parallel, but at the time of this writing, it uses polling on a lock file to check whether a service has completed all its dependencies; this is heavily prone to race conditions, and is not the correct mechanism to ensure proper service ordering, so this option cannot be considered reliable.)

Another, less obvious, but important drawback is that service-launching scripts run as scions of the shell that invoked the command, and so they may exhibit different behaviours when they're run automatically at boot time and when they're run manually by an admin, because the environment is different. Scripts usually try to run in a clean environment, but it's hard to think of everything (open file descriptors!) and every script must protect itself with a gigantic boilerplate, which adds to the inefficiency problem.

Monolithic init behemoths

The other category of service managers is made of attempts to cover the flaws of traditional service starters, and provide supervision, dependency management and sometimes readiness notification, while reducing the amount of scripting needed. Unfortunately, the results are tightly integrated, monolithic init systems straying far away from Unix core principles, with design flaws that make the historical inits' design flaws look like a joke.

  • Upstart was the first one. The concepts in Upstart are actually pretty good: in theory, it's a decent event-based service manager. Unfortunately, the implementation is less than ideal. For instance, the service file format is full of adhocisms breaking the principle of least surprise. But most importantly, Upstart was the first system that used ptrace on the processes it spawned in order to keep track of their forks. If you don't know what that means: it's complete insanity, using a debug feature in prodution, with heavy impact on security and efficiency.
  • launchd, Darwin's init and service manager. The Wikipedia page (linked here because Apple doesn't see fit to provide a documentation page for launchd) is very clear: it replaces init, rc, init.d/rc.d, SystemStarter, inetd, crontd, atd and watchdogd. It does all of this in process 1. And it uses XML for daemon configuration, so launchctl has to link in an XML parsing library, and it communicates with process 1 via a Mach-specific IPC mechanism. Is this the sleek, elegant design that Apple is usually known for? Stick to selling iPhones, guys.
  • systemd, the main protagonist (or antagonist) in the "init wars". It has the same problems as launchd, up by an order of magnitude; here is why. systemd avowedly aims to replace the whole low-level user-space of Linux systems, but its design is horrendous. It doesn't even get readiness notification right.

The problem of integrated init systems is that:

  • They have been developed by companies and associations, not individuals, and despite the licensing, they are for all intents and purposes closer to proprietary software than free software; they also suffer from many of the technical flaws of enterprise software design.
  • As a result, and despite being backed by tremendous manpower, they have been very poorly thought out. The manpower goes into the coding of features, not into architecture conception; and the architects were obviously not Unix experts, which is a shame when it's about creating a process 1 for Unix. This is apparent because:
  • They have been designed like application software, not system software, which requires a fairly different set of skills, and careful attention to details that are not as important in application software, such as minimal software dependencies and shortness of code paths.

Pages and pages could be - and have been - written about the shortcomings of integrated init systems, but one fact remains: they are not a satisfying solution to the problem of service management under Unix.

The best of both worlds

s6-rc aims to be such a solution: it is small and modular, but offers full functionality. Parallel service startup and shutdown with correct dependency management (none of the systemd nonsense where services are started before their dependencies are met), correct readiness notification support, reproducible script execution, and short code paths.

  • s6-rc is a service manager, i.e. the equivalent of sysv-rc or OpenRC. It is not an init system. You can run s6-rc with any init system of your choosing. Of course, s6-rc requires a s6 supervision tree to be running on the system, since it delegates the management of longrun services to that supervision tree, but it does not require that s6 be the init system itself. s6-rc will work when s6-svscan runs as process 1 (on Linux, such a setup can be easily achieved via the help of the s6-linux-init package), and it will also work when s6-svscan runs under another init process.
  • The service manager runs on top of a supervision suite. It does not try to make it perform boot/shutdown operations or dependency management itself; and it does not substitute itself to it. s6-rc uses the functionality provided by s6, but it is still possible to run s6 without s6-rc for systems that do not need a service manager. It would also be theoretically possible to run s6-rc on top of another supervision suite, if said supervision suite provided the hooks that s6-rc needs.
  • A significantly time-consuming part of a service manager is the analysis of a set of services and computation of a dependency graph for that set. At the time of writing this document, s6-rc is the only service manager that performs that work offline, eliminating the dependency analysis overhead from boot time, shutdown time, or any other time where the machine state changes.
  • The source format for the s6-rc-compile tool is purposefully simple, in order to allow external tools to automatically write service definitions for s6-rc - for instance for conversions between service manager formats.
  • Like every skarnet.org tool, s6-rc is made of very little code, that does its job and nothing else. The binaries are small, it is very light in memory usage, and the code paths are extremely short.

The combination of s6 and s6-rc makes a complete, full-featured and performant init system and service manager, with probably the lowest total memory footprint of any service manager out there, and all the reliability and ease of administration that a supervision suite can provide. It is a real, viable alternative to integrated init behemoths, providing equivalent functionality while being much smaller and much more maintainable.