Re: taxonomy of dependencies

From: Jonathan de Boyne Pollard
Date: Thu, 14 May 2015 22:58:56 +0100

Wayne Marshall:
> Under a supervision framework, failure of a service starting is absolutely ok. (Many novices fail to grasp the elegance of this essential feature.)

... and novices and non-novices alike fail to grasp its unscalability.
It may be fine on a hobbyist PC, but on a server in a datacentre one
gets situations like a program that needs two database servers and a
message queue broker to be up and ready before it can run, and of which
one is running 10 instances for scalability. Ten client programs crashing
and restarting over and over whilst rabbitmq-server and mysqld are
trying to come up do not make for a happy startup. "I want", says the
system administrator, "my machine to spend its precious processor and
disc on bringing up the things that everything is waiting for, not on
repeatedly starting and crashing the things that are doing the
waiting." Let us not forget the logfile and monitoring system noise
that the thundering herd approach engenders, too.

Two things make this world more tolerable: early server socket opening
and readiness protocols. Unfortunately, much "enterprise" software has
yet to even embrace the former, let alone the latter. But there are some
promising tiny green shoots. Early server socket opening makes clients
_block_ rather than _abend_. Readiness protocols fill in corner cases
that aren't necessarily strictly client-server, and also deal with the
fact that "up for over N seconds" may or may not mean "ready" according
to what day of the week it is (i.e. what the system activity pattern
happens to be at the time).
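
To make the "block rather than abend" point concrete, here is a minimal
sketch in Python. It is an illustration only: the function name, the
loopback address, and the sleep standing in for slow initialisation are
all assumptions, not taken from any particular service manager. The
listening socket is opened before the server is ready, so the client's
connect() succeeds at once and the client simply waits for its reply.

```python
import socket
import threading
import time


def demo(startup_delay=0.3):
    """Open the listening socket early, then let a slow server take it over."""
    # Opened up front, before the server has finished initialising.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    def slow_server():
        time.sleep(startup_delay)  # stands in for slow initialisation
        conn, _ = listener.accept()
        with conn:
            conn.sendall(b"ready\n")

    server = threading.Thread(target=slow_server)
    server.start()

    # The client connects straight away. The kernel queues the connection
    # in the listen backlog, so the client blocks instead of failing.
    started = time.monotonic()
    reply = b""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while not reply.endswith(b"\n"):
            chunk = client.recv(64)
            if not chunk:
                break
            reply += chunk
    waited = time.monotonic() - started

    server.join()
    listener.close()
    return reply, waited


if __name__ == "__main__":
    reply, waited = demo()
    print(reply, round(waited, 2))
```

Without the early open, the same client would get "connection refused"
and exit, to be restarted by its supervisor over and over.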

Wayne Marshall:
> Note also that in no case is it necessary for a service runscript to try starting dependencies itself -- this is all left to the supervisor.

It need not even be the purview of the service manager. nosh doesn't do
dependency processing in either the "run" programs or the service
manager. It does it in the "system-control" program. Dependency
processing is "policy", the decision of what to start and what to stop,
in what order, and when. Service management is "mechanism", the raw
mechanics of service state. With this split, one can even have two
"policies", system-control and service-dt-scanner, running at the same
time. Or someone could come along and write a third.
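
The policy/mechanism split can be sketched roughly as follows. This is
emphatically not how nosh implements it; the dependency table, the
function names, and the use of Python's graphlib (3.9 or later) are all
assumptions made for illustration. The "policy" decides the order; the
"mechanism" is whatever callable it is handed to actually start a
service.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: each service maps to what it needs first.
NEEDS = {
    "client-app": {"mysqld", "rabbitmq-server"},
    "mysqld": set(),
    "rabbitmq-server": set(),
}


def start_order(needs):
    """Policy: pick an order in which each service follows its dependencies."""
    return list(TopologicalSorter(needs).static_order())


def bring_up(needs, start):
    """Apply the policy through a caller-supplied mechanism, start(name)."""
    for name in start_order(needs):
        start(name)


# The mechanism here merely records the names; a real one would talk to
# the service manager.
started = []
bring_up(NEEDS, started.append)
print(started)
```

Because bring_up() knows nothing about how a service is started, a
second policy could drive the same mechanism alongside it.
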
Received on Thu May 14 2015 - 21:58:56 UTC
