Re: runit kill runsv

From: Thomas Lau <tlau_at_tetrioncapital.com>
Date: Thu, 23 Jun 2016 08:39:06 +0800

here is the run script:

#!/bin/sh
exec 2>&1
echo "*** Starting service ..."
RUNASUSER="tlau"
RUNASUID=$(getent passwd $RUNASUSER | cut -d: -f3)
RUNASGROUPS=$(id -G $RUNASUSER | tr ' ' ':')
exec chpst -u :$RUNASUID:$RUNASGROUPS /usr/bin/memcached -vvv -m 64


I just tested -P; it doesn't help. I can still kill the runsv process, and
the memcached daemon keeps running.

I know the OOM killer might not kill it; I'm just trying to simulate what
could happen. Who knows, I might be working on a system at 3 AM and
accidentally kill runsv. :) I want to find out how fault tolerant runit is.
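
A minimal sketch of the failure mode being simulated, assuming the supervisor is killed with SIGKILL (which no process can catch or forward). A plain sh stands in for runsv and sleep stands in for memcached; orphan_demo and the temp file are hypothetical names for illustration, and none of this is runit's own code:

```shell
#!/bin/sh
# Stand-in demo: kill a "supervisor" with SIGKILL and check whether the
# child it started is still alive afterwards. No runit is involved.
orphan_demo() {
    tmp=$(mktemp) || return 1

    # "Supervisor": start a long-running child, record its PID, then wait.
    sh -c 'sleep 30 >/dev/null 2>&1 & echo "$!" > "$1"; wait' sh "$tmp" &
    sup=$!

    # Wait until the child's PID has been written.
    while [ ! -s "$tmp" ]; do sleep 1; done
    child=$(cat "$tmp")

    # SIGKILL cannot be caught, so the supervisor gets no chance to
    # signal its child before it dies.
    kill -KILL "$sup"
    wait "$sup" 2>/dev/null || true

    if kill -0 "$child" 2>/dev/null; then
        echo "child survived"
    else
        echo "child gone"
    fi

    # Clean up the orphan and the temp file.
    kill "$child" 2>/dev/null
    rm -f "$tmp"
}

orphan_demo
```

The same check (kill -0 on the service PID after the supervisor is gone) can be pointed at a real runsv and its memcached, which is effectively the test described above.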

On Wed, Jun 22, 2016 at 11:51 PM, Colin Booth <cathexis_at_gmail.com> wrote:

> On Wed, Jun 22, 2016 at 8:08 AM, Avery Payne <avery.p.payne_at_gmail.com> wrote:
> > It almost sounds like you need to chain-load memcached using chpst. If
> > memcached has internal code to change its process group then it is
> > "escaping" supervision, which means that runsv is not in direct control
> > of it. To fix this, your ./run script would be similar to:
> >
> > #!/bin/sh
> > exec 2>&1
> > exec chpst -P memcached
> >
> > See http://smarden.org/runit/chpst.8.html for details. This would cause
> > memcached to be "captive" to the runsv process. Try the change with
> > chpst and see what happens. You may find other issues you're not seeing
> > after you make this change; check the log with tail -f /path/to/log/file
> > and see if it is restarting over and over (a "restart loop").
> >
> Memcached doesn't do anything fancy like that, at least not if you run
> it in foreground mode. Testing against an unused memcached instance at
> work, the problem really does seem to be the one I described: runsv is
> catching some non-SIGTERM signal, exiting, and orphaning the memcached
> that it's managing.
>
> Thomas, can you post your entire run script? Make sure there's nothing
> glaring. At the least your last line should be something like:
> `exec /usr/bin/memcached -v -m 512 -p 11211 -u nobody -c 1024'
>
> It does seem suspicious that runsv is getting killed but memcached is
> surviving. I've seen systems under impressive amounts of duress that
> have had their runsvdir and runsv processes survive (they are unlikely
> to be OOM targeted, they don't grow, etc).
>
> Cheers!
>
> --
> "If the doors of perception were cleansed every thing would appear to
> man as it is, infinite. For man has closed himself up, till he sees
> all things thru' narrow chinks of his cavern."
> -- William Blake
>



-- 
Thomas Lau
Director of Infrastructure
Tetrion Capital Limited
Direct: +852-3976-8903
Mobile: +852-9323-9670
Address: Suite 2716, Two IFC, Central, Hong Kong
Received on Thu Jun 23 2016 - 00:39:06 UTC
