Re: perp - how to notify if service suddenly starts dying all the time
 
On Thu, 16 Jul 2015 09:52:55 +0300
Georgi Chorbadzhiyski <georgi_at_unixsol.org> wrote:
> Yesterday, something corrupted the database file that Redis uses,
> and Redis crashed and then refused to start.
> 
> I'm using perp to monitor the service, and of course perp was doing
> its job and restarted the service after it died. The problem is
> that I can't think of a way to be notified if a service dies all the
> time. In this case, since Redis has never died on me, it would be
> enough to know if the service has been restarted X times in the last
> 30 seconds (for example).
> 
> I can monitor the logs but that doesn't seem like a good idea (to
> start parallel monitor service for each service that is being
> monitored).
> 
> Any ideas?
> 
> Here is what my rc.main script for the service looks like (it is
> pretty standard).
> 
> #!/bin/sh
> 
> exec 2>&1
> 
> TARGET="$1"
> SVNAME="$2"
> 
> [ -z "$SVNAME" ] && SVNAME=$(basename $(readlink -m $(dirname $0)))
> 
> start() {
>         echo "*** $SVNAME: starting..."
>         exec runuid -s redis /usr/bin/redis
> }
> 
> reset() {
>         case "$3" in
>         'exit')
>                 echo "*** $SVNAME: exited status $4 $PERP_SVSECS seconds runtime." ;;
>         'signal')
>                 echo "*** $SVNAME: killed on signal $5 $PERP_SVSECS seconds runtime." ;;
>         *)
>                 echo "*** $SVNAME: stopped ($3) $PERP_SVSECS seconds runtime." ;;
>         esac
>         exit 0
> }
> 
> eval "$TARGET" "$@"
> 
> exit 0
> 
Hi Georgi,
A simple way to get notified by perp is to send yourself (the admin) an
email from within the "reset" target:
...
reset() {
    case "$3" in
    'exit')
      echo "*** $SVNAME: exited status $4 $PERP_SVSECS seconds runtime."
      mail -s "$SVNAME exited" admin@myserver.com << END_MAIL
NOTICE:
The $SVNAME service has exited status $4 after runtime of $PERP_SVSECS
seconds.
END_MAIL
    ;;
    'signal')
      echo "*** $SVNAME: killed on signal $5 $PERP_SVSECS seconds runtime."
    ;;
    *)
      echo "*** $SVNAME: stopped ($3) $PERP_SVSECS seconds runtime."
    ;;
    esac
    exit 0
}
...
The above example uses a generic mail(1) command, whose options may vary
a little among platforms and mail agents.  It also uses a shell "here"
document to generate the body of the email.
This is just a bare-bones starting point.  You could embellish it to
suit your own site's requirements.
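One possible embellishment, to cover the "restarted X times in the last
30 seconds" case, is to keep a small file of restart timestamps and only
notify once the count inside the window reaches a limit.  The following
is a minimal sketch, not anything perp provides: the function names
restart_count and too_many_restarts, the state file argument, and the
reliance on "date +%s" (widely available, but not required by POSIX) are
all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: count restarts inside a sliding time window.
# Nothing here is part of perp; names and file layout are assumptions.

# restart_count STATEFILE WINDOW_SECS [NOW]
# Records one restart at time NOW (default: current epoch seconds),
# drops entries older than WINDOW_SECS, and prints how many restarts
# remain in the window (including this one).
restart_count() {
    statefile="$1"
    window="$2"
    now="${3:-$(date +%s)}"
    cutoff=$((now - window))

    # Keep only timestamps newer than the cutoff, then add this restart.
    touch "$statefile"
    awk -v cutoff="$cutoff" '$1 > cutoff' "$statefile" > "$statefile.tmp"
    echo "$now" >> "$statefile.tmp"
    mv "$statefile.tmp" "$statefile"

    wc -l < "$statefile" | tr -d ' '
}

# too_many_restarts STATEFILE WINDOW_SECS LIMIT [NOW]
# Returns 0 (true) when this restart brings the count to LIMIT or more.
too_many_restarts() {
    [ "$(restart_count "$1" "$2" "$4")" -ge "$3" ]
}
```

Inside the "reset" target, the mail command could then be wrapped in
something like: if too_many_restarts "$STATEFILE" 30 5; then ... fi,
where STATEFILE is a per-service path of your choosing.  Note that this
sends a mail on every restart past the limit, so a real setup would
probably also want to suppress repeats.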
Another suggestion is to develop an executable "perp_notify" script that
incorporates the above to provide a consistent notification message,
without having to duplicate it in every runscript.
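As a rough sketch of what such a helper might look like, here it is
written as shell functions that a runscript could source.  Everything
named here is hypothetical: the function names, the NOTIFY_TO and
NOTIFY_MAILER variables, and the admin@example.com placeholder address.
It assumes a mail(1) that accepts "-s subject", as in the example above.

```shell
#!/bin/sh
# Hypothetical "perp_notify" helper, meant to be sourced by a runscript.
# The variable names and the default address are placeholders.

NOTIFY_TO="${NOTIFY_TO:-admin@example.com}"
NOTIFY_MAILER="${NOTIFY_MAILER:-mail}"

# notify_body SVNAME HOW DETAIL SECS
# Prints the message body, e.g. HOW=exit DETAIL=1 SECS=12.
notify_body() {
    cat << END_MAIL
NOTICE:
The $1 service stopped ($2 $3) after a runtime of $4 seconds.
END_MAIL
}

# perp_notify SVNAME HOW DETAIL [SECS]
# Pipes the body to the mailer; SECS defaults to $PERP_SVSECS if set.
perp_notify() {
    notify_body "$1" "$2" "$3" "${4:-${PERP_SVSECS:-unknown}}" |
        "$NOTIFY_MAILER" -s "$1 stopped ($2)" "$NOTIFY_TO"
}
```

Each runscript's "reset" target would then only need a single line per
case, for example: perp_notify "$SVNAME" exit "$4".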
All the best,
Wayne
Received on Thu Jul 16 2015 - 12:13:15 UTC