s6-linux-init: the s6-linux-init-maker program

The `s6-linux-init-maker` program

s6-linux-init-maker reads configuration options on the command line, and outputs a directory to place in the root filesystem. That directory contains a script that is suitable as an /sbin/init program as well as all the necessary files that this script needs to properly boot and bring up a full s6 infrastructure.

s6-linux-init-maker only writes scripts. At boot time, these scripts will call commands provided by other skarnet.org packages such as execline and s6. It is the responsibility of the administrator to make sure that all the dependencies are properly installed at boot time, and that the correct options have been given to s6-linux-init-maker so that the programs are found on the root filesystem of the machine. If this is not the case, the system will fail to boot.

Interface and usage

     s6-linux-init-maker \
       [ -V boot_verbosity ] \
       [ -c basedir ] \
       [ -u log_user ] \
       [ -G early_getty ] \
       [ -1 ] \
       [ -L ] \
       [ -p initial_path ] \
       [ -m initial_umask ] \
       [ -t timestamp_style ] \
       [ -d slashdev ] \
       [ -s env_store ] \
       [ -e initial_envvar ] ... \
       [ -q finalsleeptime ] \
       [ -D initdefault ] \
       [ -n | -N ] \
       [ -f skeldir ] \
       [ -R resource_limit_list ] \
       [ -C ] \
       [ -B ] \
       [ -S ] \
       [ -W readyfd ] \
       dir

s6-linux-init-maker must be run on the machine that will boot an s6-based system.
It should normally be run as root. It supports not running as root for a small number of very specific cases, but you should run it as root unless you know exactly what you are doing.
s6-linux-init-maker parses options on its command line.
It writes data into a directory dir, which must not exist beforehand.
It exits 0 if everything went well, 100 if a user error occurred, and 111 if a problem occurred during the creation of the directory or its contents.
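
The exit codes above can be checked from a small wrapper. What follows is a minimal sketch, not part of s6-linux-init: the run_maker function and the MAKER variable are hypothetical names, and the pre-check on dir only mirrors the rule that dir must not exist beforehand. The option values in the usage comment are illustrative.

```sh
#!/bin/sh
# Hypothetical wrapper: run s6-linux-init-maker and explain its exit code.
# MAKER may be overridden (for instance with a stub) for dry runs.
run_maker() {
    maker=${MAKER:-s6-linux-init-maker}
    last=
    # dir is the last argument and must not exist beforehand.
    for last in "$@"; do :; done
    if [ -n "$last" ] && [ -e "$last" ]; then
        echo "refusing to run: $last already exists" >&2
        return 100   # same code the tool uses for user errors
    fi
    rc=0
    "$maker" "$@" || rc=$?
    case "$rc" in
        0)   echo "ok: $last created" ;;
        100) echo "user error: check the command-line options" >&2 ;;
        111) echo "system error while creating $last" >&2 ;;
        *)   echo "unexpected exit code $rc" >&2 ;;
    esac
    return "$rc"
}

# Example (as root): console copy enabled, early getty on tty2, ISO 8601
# timestamps, TZ set in the global environment.
# run_maker -1 -G "/sbin/getty 38400 tty2" -t 2 -e TZ=UTC /tmp/s6-init-new
```

The wrapper only defines a function; nothing is created until run_maker is called with real options.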

Once the command has been run and dir has been created, there are a few manual steps to take:

- s6-linux-init-maker has copied some scripts from the /etc/s6-linux-init/skel directory (or the directory you gave as an argument to the --skeldir configure option at build time) to the dir/scripts directory. You should edit these scripts and adapt them to your use case. (Or you could edit the skeleton scripts before running s6-linux-init-maker.) The scripts are:
  - rc.init: this script will be run as stage 2 initialization, i.e. the initialization that happens once s6-svscan is running as process 1, and should contain all your normal system bootup tasks. Typically, it should initialize the service manager and then order it to bring the machine to its fully operational state. rc.init is given the default runlevel as its first argument (i.e. the name of the state the machine should be brought to, traditionally default for OpenRC and 2 or 5 for sysv-rc); the rest of its command line is the kernel's command line minus the arguments of the key=value form, which are stored into env_store when the -s option is given. If the -C option has been given to s6-linux-init-maker and the system is indeed running in a container, the rest of the command line is just the command line that has been given to the container's init (e.g. for Docker: the CMD). Note that the runlevel script should not be invoked in a container, which does not have a notion of runlevels.
  - rc.shutdown: this script will be run as the shutdown sequence, when the administrator runs the shutdown, halt, poweroff or reboot command. (On non-containerized systems, it is also run for init 0, init 6, telinit 0 and telinit 6, for sysvinit compatibility.) It should ask the service manager to bring all the services down, and exit when it is done; in other words, it should not try to perform a hard halt/poweroff/reboot itself. No arguments are given to this script.
  - runlevel: this script will be invoked for every runlevel change, i.e. every change of machine state. It is given one argument: the name of the runlevel to change to. Typically, the runlevel script should just invoke the service manager, asking it to bring the machine to the wanted runlevel. In a containerized system, this script should not be used at all.
- Copy the dir directory to the place declared as basedir (/etc/s6-linux-init/current by default). Be careful: it contains fifos, files with precise uid/gid permissions, and files with non-standard access rights, so be sure to copy it verbatim. The s6-hiercopy tool can do it, as well as the GNU or busybox cp -a or mv commands.
- Back up your /sbin. Then copy, link or symlink all the scripts and symlinks in the basedir/bin directory into /sbin. In particular, the basedir/bin/init script should be accessible as /sbin/init.
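
The copy, backup and link steps above can be sketched as a shell function. This is only an illustration under stated assumptions: install_init_dir is a hypothetical name, the backup location (the sbin path with a .bak suffix) is an arbitrary choice, and symlinking is just one of the three allowed ways of exposing the scripts. basedir should be an absolute path so that the symlinks resolve.

```sh
#!/bin/sh
# Hypothetical helper sketching the manual installation steps.
# $1: directory created by s6-linux-init-maker
# $2: basedir given with -c (absolute path)
# $3: target sbin directory (normally /sbin)
install_init_dir() {
    newdir=$1 basedir=$2 sbindir=$3

    # basedir must not exist yet, otherwise cp would nest the copy inside it.
    if [ -e "$basedir" ]; then
        echo "refusing to overwrite $basedir" >&2
        return 100
    fi
    # Copy verbatim: cp -a (GNU or busybox) keeps fifos, ownership and modes.
    cp -a "$newdir" "$basedir" || return 111

    # Back up the sbin directory before touching it.
    mkdir "$sbindir.bak" && cp -a "$sbindir/." "$sbindir.bak/" || return 111

    # Symlink every script and symlink of basedir/bin into sbin.
    for f in "$basedir"/bin/*; do
        [ -e "$f" ] || [ -L "$f" ] || continue
        ln -sf "$f" "$sbindir/${f##*/}" || return 111
    done
}

# Example (as root), after editing the scripts in /tmp/s6-init-new/scripts:
# install_init_dir /tmp/s6-init-new /etc/s6-linux-init/current /sbin
```

Trying the function on scratch directories first is a cheap way to check the result before pointing it at the real /sbin.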

Boot sequence

When the kernel boots, it may run an initramfs first, but in any case it then runs the /sbin/init script, also known as stage 1. This script is just an execution of the s6-linux-init program with some command-line options that are directly transferred from the s6-linux-init-maker invocation. Refer to the s6-linux-init man page to know exactly what it does.

s6-linux-init-maker options

-V boot_verbosity : how verbose the boot will be. Default is 1, which means that only important warnings will be printed. Increasing this value may yield more, but usually harmless, warning messages.

-c basedir : at boot time, stage 1, which should be accessible as /sbin/init, will read its read-only data from basedir. After running s6-linux-init-maker, you should make sure to copy the created directory dir to basedir. basedir must be absolute. Default is /etc/s6-linux-init/current.

-u log_user : the catch-all logger will run as the log_user user. Default is root.

-G early_getty : if this option is set, s6-linux-init-maker will define an additional s6 service that will be named s6-linux-init-early-getty and started at the same time rc.init is executed. This early service should be a getty, or equivalent, to allow logins even if stage 2 (rc.init) fails. early_getty should be a simple command line: for instance, "/sbin/getty 38400 tty1". By default, no early service is defined.

-1 : make it so that all the messages that are sent to the catch-all logger (i.e. all the error messages that are not caught by a dedicated logger, as well as the output from rc.init, runlevel and rc.shutdown) are also copied to /dev/console. (Timestamps are not copied to /dev/console.) This is generally useful to debug a system at a glance, but if a failing program keeps sending error messages, it may interfere with comfortable usage of an early getty. A common workaround is to make the early getty start on tty2 and leave tty1 for /dev/console to print on.

-L : add an early s6-linux-init-logouthookd service to clean up utmp records at user logout time. Check the s6-linux-init-logouthookd page for details.

-p initial_path : the initial value for the PATH environment variable, which will be transmitted to all the starting processes unless it is overridden by a PATH declaration via the -e option. It is absolutely necessary for execline and s6 binaries to be accessible via initial_path, else the machine will not boot. Default is /usr/bin:/bin.

-m initial_umask : the value of the initial file umask for all the starting processes, in octal. Default is 022.

-t timestamp_style : how logs are timestamped by the catch-all logger. 0 means no timestamp, 1 means external TAI64N format, 2 means ISO 8601 format, and 3 means both. Default is 1.

-d slashdev : mount a devtmpfs. If this option is given, s6-linux-init will mount a devtmpfs pseudo-filesystem on slashdev. This is useful if the kernel has not been configured to mount the devtmpfs at boot time and there is no static /dev. By default, it is assumed that there is a suitable /dev at boot time, and no additional devtmpfs is mounted.

-s env_store : stage 1 init sometimes inherits a few environment variables from the kernel. (These variables correspond to the arguments on the kernel command line that are of the form key=value.) It empties its environment before spawning rc.init and executing into s6-svscan, in order to prevent those "kernel" environment variables from leaking into the whole process tree. However, sometimes those variables are needed at a later time; in that case, giving the -s option to s6-linux-init-maker makes stage 1 init dump the "kernel" environment variables into the env_store directory (under a format that is later readable with s6-envdir -f) before erasing them. env_store should obviously be a writable directory, so it should be located under /run (or your chosen tmpfsdir)! If this option is not given, the environment inherited from the kernel isn't saved anywhere - which is the default.

-e initial_envvar : this option can be repeated. For every initial_envvar, s6-linux-init-maker will adjust the global environment directory in dir/env. initial_envvar must either be of the form VAR, to make sure that VAR does not appear in the global environment, or of the form VAR=VALUE, to add an environment variable VAR with the value VALUE. The global environment is the environment that every supervised process (as well as the rc.init script) will run with, so it will be inherited by default by every process running on the system. The TZ variable, for instance, is a good candidate to be set in the global environment.
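
The two accepted forms of initial_envvar can be told apart with a simple pattern match. The helper below is a hypothetical illustration of the rule, not code from s6-linux-init-maker; rejecting an empty variable name is an assumption of this sketch.

```sh
#!/bin/sh
# Hypothetical helper: describe what a given -e argument asks for.
describe_envvar() {
    case "$1" in
        ''|=*) echo "invalid: empty variable name" >&2; return 100 ;;
        *=*)   echo "set ${1%%=*} to '${1#*=}'" ;;       # VAR=VALUE form
        *)     echo "remove $1 from the global environment" ;;  # VAR form
    esac
}
```

For example, describe_envvar TZ=Europe/Paris prints set TZ to 'Europe/Paris', while describe_envvar LANG reports that LANG is removed from the global environment.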

-q finalsleeptime : when the machine shuts down, all processes that have not already been killed during the rc.shutdown script will receive a SIGTERM or a SIGHUP to allow them to exit gracefully; then, after finalsleeptime milliseconds, they will receive a SIGKILL and the shutdown sequence will go on. This option configures the amount of time that will elapse between the SIGTERM/SIGHUP and the SIGKILL. Default is 3000, meaning a grace period of 3 seconds.

-D initdefault : boot the system with a runlevel set to initdefault, which can be an arbitrary string, but is usually 2, 3, 5 (traditional sysvinit behaviour) or default (OpenRC behaviour). Default is default. Note that if a 2, 3, 4, 5, or default argument is encountered in the kernel command line, it will be interpreted as the runlevel to boot the system on, and will override the default given here.
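
The override rule can be illustrated with a few lines of shell. This is a sketch of the rule as described, not the actual implementation: pick_runlevel is a hypothetical name, and the choice that the last matching word wins when several are present is an assumption.

```sh
#!/bin/sh
# Hypothetical illustration of the runlevel selection rule.
# $1 is the initdefault value given with -D; the remaining arguments stand
# for the words of the kernel command line.
pick_runlevel() {
    runlevel=$1
    shift
    for word in "$@"; do
        case "$word" in
            2|3|4|5|default) runlevel=$word ;;   # assumption: last match wins
        esac
    done
    echo "$runlevel"
}
```

With initdefault set to default, a kernel command line containing the word 3 yields runlevel 3; without such a word, the value given to -D is kept.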

-n : at boot time, assume that a tmpfs is already present on /run (or the argument that was given to the --tmpfsdir configure option at build time) and that its contents are essential. Instead of unmounting /run then mounting a tmpfs on it, s6-linux-init will simply remount /run. This option is useful when s6-linux-init is used on a distribution that imposes its initramfs and said initramfs writes data to /run that is then used by the distribution's initialization scripts. (An initramfs should normally be transparent and leave no trace in the filesystem; unfortunately, a lot of distributions do not care.) By default, /run will be unmounted at boot time (just in case), and then a tmpfs will be mounted on it. Do not use this option if you are not sure: failure to remount /run will cause init to die and the kernel to panic. This option is incompatible with the -N option.

-N : at boot time, do not perform mounting/unmounting/remounting on /run (or the tmpfsdir declared at build time) at all. By default, a tmpfs is mounted on /run at boot time. This option is useful when s6-linux-init is used to boot on an initramfs that will remain the de facto rootfs of the system (which is the case for instance in certain live CDs or certain embedded devices), in which case the rootfs is already read-write and in RAM and mounting an additional tmpfs is unnecessary. Do not use this option if your rootfs is read-only: failure to write to /run will cause init to die and the kernel to panic. This option is incompatible with the -n option.

-f skeldir : copy the skeleton scripts from directory skeldir. By default, skeldir is /etc/s6-linux-init/skel, or the directory that has been given as an argument to the --skeldir configure option at build time. This option is typically useful when distributions run s6-linux-init-maker in packaging scripts, when preparing files in a staging directory.

-R resource_limit_list : declare global resource limits (a.k.a. "hard limits") for the system to be booted. resource_limit_list is a comma-separated list of instructions such as o2000, d= or c0: a letter followed by either the character =, which means unlimited, or a number, which is the value of the resource limit. The letter specifies the resource being addressed, as defined by the option letters used by s6-softlimit: for instance, c means core file size limit, and o means open fds limit. Note that unlike s6-softlimit, which only sets soft limits, i.e. process hierarchy-wide limits, the values given here declare hard limits that will be enforced for the whole system to be booted: it will be impossible to raise soft limits above these values. Warning: misuse of this option is likely to make your system unbootable; make sure you don't prevent process 1 and the whole process hierarchy from allocating enough resources.
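
Because a mistake in resource_limit_list can make the system unbootable, it may help to read the list back before using it. The function below is a minimal sketch with a hypothetical name, explain_limits; it only checks the letter-plus-value syntax described above and does not know which letters s6-softlimit actually accepts.

```sh
#!/bin/sh
# Hypothetical helper: explain a resource_limit_list such as "o2000,d=,c0".
explain_limits() {
    old_ifs=$IFS
    IFS=,
    set -- $1          # split the list on commas
    IFS=$old_ifs
    for item in "$@"; do
        letter=${item%"${item#?}"}   # first character: the resource letter
        value=${item#?}              # the rest: "=" or a number
        case "$value" in
            =)           echo "$letter: unlimited" ;;
            ''|*[!0-9]*) echo "$letter: invalid value '$value'" >&2; return 100 ;;
            *)           echo "$letter: hard limit $value" ;;
        esac
    done
}
```

With the list from the text, explain_limits o2000,d=,c0 reports a hard limit of 2000 for o, no limit for d, and a hard limit of 0 for c.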

-C : create a set of scripts that is suitable for running in a container. This modifies some behaviours:
- SIGTERM will be caught by s6-svscan, and cause an orderly shutdown of the container, as if the "poweroff" script had been invoked.
- No early runleveld service is created. Changing runlevels via s6-linux-init-telinit will be unsupported in a container.
- Consequently, the first argument to the rc.init script will always be default (or initdefault if the -D option has been given to s6-linux-init-maker). The rest of the arguments to the rc.init script will be the arguments given to the init program when running the container.
- If the -s option has been given, env_store will contain the initial environment given to the container.
- The ultimate output fallback (i.e. the place where error messages go when nothing catches them, e.g. the error messages from the catch-all logger and the s6-supervise process managing the catch-all logger) is not /dev/console, but the descriptor that was init's standard error.
- Stopping the container with reboot will make the container's init program report being killed by a SIGHUP. Stopping it with poweroff will make it report being killed by a SIGINT. This is according to the reboot(2) specification.
- Stopping the container with halt, however, is different. It will make the container's pid 1 read a number in the /run/s6-linux-init-container-results/exitcode file (the /run prefix can be changed at build time via the --tmpfsdir configure option), and exit with the code it has read. (Default is 0.) This means that in order to run a command in a container managed by s6-linux-init and exit the container when the command dies while reporting the exit code to its parent, you should:
  - Run that command via rc.init
  - Store its exit code in the /run/s6-linux-init-container-results/exitcode file
  - Call halt
  All the running services will be killed, all the zombies will be reaped, and the container will exit with the required exit code.
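
The three steps above can be sketched as a fragment for rc.init. This is an illustration under assumptions: run_then_halt is a hypothetical name, the results file and halt command are passed as parameters so the function can be tried outside a container, and writing the code as a decimal number followed by a newline is an assumption about the expected file format.

```sh
#!/bin/sh
# Hypothetical rc.init fragment for a container built with -C.
# Runs a command, stores its exit code where the container's pid 1 will
# read it on halt, then calls halt.
run_then_halt() {
    results_file=$1   # normally /run/s6-linux-init-container-results/exitcode
    halt_cmd=$2       # normally the halt script installed from basedir/bin
    shift 2
    code=0
    "$@" || code=$?
    echo "$code" > "$results_file"
    "$halt_cmd"
}

# In rc.init, the first argument is the runlevel and the rest is the
# command line given to the container's init:
# shift
# run_then_halt /run/s6-linux-init-container-results/exitcode halt "$@"
```

If /run was changed at build time with --tmpfsdir, the results file path changes accordingly.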

-B : run the system without a catch-all logger. On a non-containerized system, that means that all the logs from the s6 supervision tree will go to /dev/console, and that /dev/console will also be the default stdout and stderr for services running under the supervision tree: use of this option is discouraged. On a containerized system (when paired with the -C option), it simply means that these outputs go to the default stdout and stderr given to the container's init - this should generally not be the default, but might be useful in some cases.

-S : when used with the -C option, set up the container so the disks are synced on container halt. By default, no sync is performed. This option has no effect when the -C option is not present: on real machines, a sync is always performed just before a system halt.

-W readyfd : ensure that at boot time, before doing anything, s6-linux-init waits for file descriptor readyfd to signal EOF. This is typically useful in containers that implement the Docker synchronization mechanism, where the container manager starts the container with a pipe open to the container's descriptor 3, does its preparation, and closes the pipe to tell the container's init that it can proceed. If this option is not given, or readyfd is 0, s6-linux-init-maker makes no provision for synchronization and s6-linux-init will boot without waiting.

Organization of the created directory

If s6-linux-init-maker returns successfully, dir contains data that will be used at boot time. (Actually, basedir will be used at boot time, not dir. Do not forget to copy dir to basedir once you have checked you are happy with what s6-linux-init-maker has created.)

This boot-time data is made of several subdirectories:

bin: this subdirectory contains scripts and symlinks that should be copied to /sbin or /bin. There is an init program performing stage 1 init, a telinit program to change runlevels, and utilities to order a machine shutdown.
env: this subdirectory is the envdir that is used to store the global environment. It will be read at boot time by stage 1 init, and transmitted to all spawned processes.
scripts: this subdirectory contains a copy of the skeleton scripts that have been installed in /etc/s6-linux-init/skel (or the argument to the --skeldir configure option at build time). These scripts should be edited before booting. They are described above.
run-image: this is a file hierarchy that will be copied verbatim at boot time to the newly made and mounted /run tmpfs (or whatever your tmpfsdir is). The subdirectories it contains are the following:
- uncaught-logs: this is the directory where the catch-all logger will store and rotate the error messages produced by the s6 supervision tree and the services that do not redirect their own logs. Not present if the -B option has been given.
- service: /run/service will be the scandir. It initially contains a .s6-svscan subdirectory that tells s6-svscan what to do if it receives a signal (typically via the ctrlaltdel combination) and ensures a hard reboot if s6-svscan ever fails. It also contains a list of early services, i.e. s6 services that will be run at boot time as soon as s6-svscan is executed. These services are:
  - s6-svscan-log: the catch-all logger. Not present if the -B option has been given.
  - s6-linux-init-shutdownd: a service that listens to shutdown commands such as reboot and triggers the software shutdown procedure.
  - s6-linux-init-runleveld: a service that listens to runlevel change commands such as telinit and calls the runlevel script in a reproducible environment to bring the machine to the wanted state. Not present if the -C option has been given.
  - s6-linux-init-logouthookd: the "clean up user utmp records at logout time" service. See the s6-linux-init-logouthookd page for details. Not present if the -L option has not been given.
  - s6-linux-init-early-getty: the early getty service, that will allow a user to log in even if rc.init fails to bring the machine to a state where logins are possible. Not present if the -G option has not been given.
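
Before copying dir to basedir, the layout described above can be checked quickly. The function below is a hypothetical sanity check, not a tool shipped with s6-linux-init; it only looks for the entries named in this section and treats a missing uncaught-logs directory as a note, since that is expected when -B has been given.

```sh
#!/bin/sh
# Hypothetical sanity check of a directory created by s6-linux-init-maker.
check_init_dir() {
    dir=$1
    status=0
    for path in bin/init env scripts/rc.init scripts/rc.shutdown \
                run-image/service; do
        if [ ! -e "$dir/$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    # uncaught-logs is present unless -B was given.
    [ -d "$dir/run-image/uncaught-logs" ] ||
        echo "note: no run-image/uncaught-logs (expected with -B)" >&2
    return "$status"
}

# Example:
# check_init_dir /tmp/s6-init-new && echo "layout looks complete"
```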

Notes

A directory created by s6-linux-init-maker is only valid on the machine it has been created on. Pre-creating init directories for other machines is not supported. Of course, the scripts are editable, so advanced users can run s6-linux-init-maker to create a basic template, and then make their own modifications.

After booting, basedir should remain untouched during the lifetime of the machine, because the machine state change and shutdown procedures will look for data in basedir. New invocations of s6-linux-init-maker should use a different basedir.

The difficult parts of running s6-svscan as process 1 are:

- The supervision tree requires writable directories, so in order to accommodate read-only root filesystems, a tmpfs needs to be mounted before s6-svscan is run.
- There is a catch-22 in redirecting the supervision tree's output away from /dev/console (which is fine for a first process invocation but impractical for log management of a whole process tree) and into a logger that is itself managed by the supervision tree it is reading data from.
- Keeping up appearances of compatibility with another init system is difficult: in particular, the mechanisms around the shutdown procedure are fundamentally different from those of just about any other init system, so even a simple command such as reboot needs an ad-hoc implementation.
- Even for simple systems such as containerized ones, making sure that the wanted commands only run when s6-svscan is ready requires a bit of manipulation.