Re: runsvdir polling

From: Laurent Bercot <ska-supervision_at_skarnet.org>
Date: Sun, 15 Jan 2017 11:38:21 +0000

>How much time in how much uptime? Which process are you looking at with
>this accumulated time?

  That's not an argument.
  Even if runit's (or sysvinit's, or...) polling is completely
unnoticeable - and on a desktop, to be fair, it definitely is - it
doesn't mean that polling is good.
  Polling is evil for several reasons, the two most important of them
being the following.

  1. From a software engineering POV: software that depends on underlying
polling has to wait for the polling period to be sure its changes have
been taken into account. When you have a system A that polls, a system B
that depends on A and also polls in another place, etc., the polling
periods compound each other, and this can introduce significant delays.
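
  To put rough numbers on that compounding, here is a minimal sketch. It
assumes each layer polls at a fixed period and that a change reaches it
at a uniformly random moment within that period, so a layer adds half its
period on average and a full period at worst, and stacked layers add up.
The helper name and the example periods are made up for illustration.

```python
def stacked_polling_delay(periods):
    """Return (average, worst_case) delay in seconds for a chain of pollers.

    Assumes each layer polls at a fixed period and that a change reaches it
    at a uniformly random point in that period, so the layer adds period / 2
    on average and a full period at worst.
    """
    average = sum(p / 2 for p in periods)
    worst = sum(periods)
    return average, worst


# Hypothetical stack: three layers polling every 5 s, 1 s and 10 s.
average, worst = stacked_polling_delay([5, 1, 10])
print(f"average {average} s, worst case {worst} s")
```

  With those three hypothetical layers, a change waits 8 seconds on
average and up to 16 seconds before the top of the stack sees it, without
any real work being done in between.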

  I have met a real-life example of this at Google, of all places:
sending a high-level command to a batch of servers takes several
seconds, sometimes several minutes - not because there's a lot of CPU
involved, or high network delays, but because Google's software stack is
big, and most of its layers perform polling at some point (processing a
message queue every second, or every few seconds, instead of getting
notified when a new message arrives), and it adds up. Most of Google's
high-level commands (start a service on a cluster of servers, change
routing tables to direct traffic somewhere else, etc.) spend a whole lot
of time doing nothing. Big scripts using high-level commands, such as
automated software rollouts, sometimes take hours to complete, whereas
the real work involved could be accomplished in 10-15 minutes tops; and
software designed around polling is the main culprit.
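
  For contrast, a notification-driven consumer can look like the sketch
below. It is not how any Google component works, just a generic
illustration with Python's standard queue module: the consumer blocks in
get() with no timeout and is woken when a message is put, instead of
checking the queue once per second. The None sentinel and the message
strings are assumptions of the sketch.

```python
import queue
import threading


def consume(messages, handled):
    """Handle messages as they arrive; sleep in get() while the queue is empty."""
    while True:
        item = messages.get()   # blocks with no timeout: no periodic wake-up
        if item is None:        # sentinel (an assumption of this sketch): stop
            break
        handled.append(item)


messages = queue.Queue()
handled = []
worker = threading.Thread(target=consume, args=(messages, handled))
worker.start()

for command in ("start service", "update routes"):
    messages.put(command)       # put() wakes the blocked consumer right away
messages.put(None)
worker.join()
print(handled)
```

  A polling version would replace the blocking get() with a non-blocking
check followed by a sleep, and every message would then wait for up to
one sleep interval before being handled.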

  2. At the complete opposite end of the scale, from an energy
consumption POV: you obviously will not notice that on a desktop, but on
embedded devices that are supposed to use energy sparingly (think set
top boxes in "sleep" mode, battery-powered/handheld devices, etc.),
every spurious wake-up counts. Even if, as runit does, you only wake up
once every 14 seconds, it means you pull the CPU out of sleep mode - it
cannot sleep for extended periods of time.

  I have also met a real-life example of this, when I was working at
Sagemcom (a French manufacturer of embedded devices), making the base
system for an energy gateway (the thing that's supposed to collect and
report consumers' energy consumption; obviously, for such a task, it
should NOT use too much energy itself, else consumers will be angry and
have the right to be). I had spent a lot of time and energy trying to
convince the project leader that we should not use D-Bus, to no avail;
his argument was that D-Bus made communication easy with the Java part
(the proprietary application software was in Java). It was making *my*
life hell, but whatever, I could take it, I'm not a Java dev snowflake.
So I made a cool base system with no polling at all, with low-level
software reporting data to D-Bus, and shipped it to the Java people.
Later on, testing showed that the energy consumption was through the
roof, and the board was brought back to us for analysis; what we
discovered was that the chosen JVM implemented its D-Bus client by
actually polling its message queue every tenth of a second.
  Yes, the thing was waking up 10 times per second. Really.
  And, obviously, none of the tweaks we made were of any help: increasing
the polling period did not cut down energy consumption quite enough, but
it did cause significant delays in the handling of D-Bus data. I don't
know what became of the project, because I left the team at that time
(my part worked, so there was no reason for me to stay there, so they
put me on another project), but so far I haven't heard of a Sagemcom
energy gateway being deployed in people's homes.
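
  The trade-off behind that tuning dead end can be sketched with simple
arithmetic. The function below is hypothetical and assumes, as a
simplification, that events arrive at a uniformly random moment within
the polling period, so the average added delay is half the period. It
says nothing about the actual power draw of any board; it only counts
wake-ups.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polling_cost(period):
    """Return (wake-ups per day, average added delay in seconds) for one poller."""
    return SECONDS_PER_DAY / period, period / 2


# Periods mentioned in this thread, plus 1 s as an intermediate value.
for period in (0.1, 1, 14):
    wakeups, delay = polling_cost(period)
    print(f"period {period:>4} s: {wakeups:>8.0f} wake-ups/day, "
          f"{delay:.2f} s average delay")
```

  Lengthening the period trades wake-ups for latency one for one: going
from 0.1 to 1 second cuts wake-ups tenfold and multiplies the average
delay by ten. Only blocking until notified removes both costs.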

  When you're an init system, i.e. the lowest possible level for
user-space software, and you *already* introduce polling, well, it
doesn't bode well for the rest of your software stack. Your
energy-saving device is already screwed, and automation that relies on
runsvdir picking up a new service is already eating an average 7 second
delay. As a desktop user, you obviously don't care; as a software
architect, this makes me shake my head.
Even systemd does better on that point.

--
  Laurent
