svlogd crashes very often on raspberry pi

From: Jacco Ligthart <jacco_at_ligthart.nu>
Date: Fri, 06 Dec 2013 00:27:02 +0100

Hi list,

I use my raspberry to do some continuous logging of serial data. For
this I choose the cereal package on raspberian
(http://cmrg.fifthhorseman.net/wiki/cereal)

Cereal uses runit and svlogd under the hood to do the actual logging.
The good part of this is, for me, that all log lines are prepended with
with a timestamp.

After some time, however, when I wanted to change the rotation frequency
to something time based (versus the default size based) I found that the
rotation time would always be skipped. After much debugging I'm
convinced that my config file is correctly formatted, in the right
directory, readable and actually read by svlogd.

What I think that happens is that svlogd crashes *often* and that the
time based rotation is based on the 'up time' of the svlogd process. I
think that every time the process is restarted the timer starts over.

to demonstrate how much it crashes I did this onliner:
pi_at_blah ~ $ while true; do ps ax |grep svlogd|grep main; sleep 5 ; done
27359 ? S 0:00 svlogd -tt ./main
27369 ? S 0:00 svlogd -tt ./main
27386 ? S 0:00 svlogd -tt ./main
27403 ? S 0:00 svlogd -tt ./main
27420 ? S 0:00 svlogd -tt ./main
27430 ? S 0:00 svlogd -tt ./main
27440 ? S 0:00 svlogd -tt ./main
27450 ? S 0:00 svlogd -tt ./main
27465 ? S 0:00 svlogd -tt ./main
27475 ? S 0:00 svlogd -tt ./main
27490 ? S 0:00 svlogd -tt ./main
27505 ? S 0:00 svlogd -tt ./main
27515 ? S 0:00 svlogd -tt ./main

In this example it can be seen that the process is restarted at least
very 5 seconds. (earlier though I also noticed sometimes a life of 15
seconds for the process)

The exact line that (re)starts svlogd is:
exec chpst -u "$LOGUSER:$LOGGROUP" svlogd -tt ./main <../socket

I have the following version installed:
runit 2.1.1-6.2 armhf system-wide service supervision

I'm a bit at a loss on how to debug this. One of the issues with the
debugging is that "high" cpu usage stops the crashing behaviour. (I
actually notice then that the time based rotation works). This "high"
cpu usage not only occurs when my log mangle script runs, but also if I
try to use strace on svlogd. To be clear about this, svlogd stops
crashing when run under strace.

Besides this I notice that I sometimes miss log lines. Or that sometimes
two log lines are misformed into one.

Any thoughts on what's going on or how to debug appreciated.

Jacco
Received on Thu Dec 05 2013 - 23:27:02 UTC

This archive was generated by hypermail 2.3.0 : Sun May 09 2021 - 19:44:18 UTC