Watchdog for dhcpd

Tue Sep 5 15:23:55 UTC 2006

On Tue, 2006-09-05 at 08:08 -0700, David W. Hankins wrote:
> In my sysadmin days I never liked watchdogs.  I preferred this
> sort of model of "service keepalives":
> 
> 	#!/bin/sh
> 
> 	(while true ; do
> 		dhcpd -f >/dev/null 2>&1
> 		msg="DHCP exited with status $? on `hostname`"
> 		logger -p daemon.crit "$msg"
> 		echo "$msg" | Mail -s 'Service failure' sysadmins@example.com
> 		# Very important - if the config file is bogus or some
> 		# other persistently bad condition, do not spin...as fast.
> 		sleep 5
> 	done) &

Simple and good, but it makes it hard to stop or restart the service
without triggering a bogus error message. A small improvement is to
have one script that does basically the above but, instead of logging a
message and sending mail, just sets a flag somewhere (for example by
creating a sentinel file). A second script then watches for the flag and
logs the error, sends the mail and so on if the flag ever appears.

That way, you can stop the service cleanly by stopping the flag-watcher,
stopping the daemon, and then clearing the flag.
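
As a rough sketch of the two-script idea, and nothing more than that:
the flag path, the variable names and the function names below are all
made up for illustration, and having the watcher clear the flag once it
has reported is an assumption (otherwise it would report the same
failure on every pass).

```shell
#!/bin/sh
# Sketch only. FLAG, DAEMON_CMD, REPORT_CMD and the function names are
# illustrative assumptions, not anything shipped with dhcpd.

FLAG="${FLAG:-/tmp/dhcpd.failed}"                  # sentinel file (hypothetical path)
DAEMON_CMD="${DAEMON_CMD:-dhcpd -f}"               # foreground daemon, as in the quoted script
REPORT_CMD="${REPORT_CMD:-logger -p daemon.crit}"  # how the watcher reports a failure

# Run the daemon once in the foreground. When it exits, record the
# exit status in the flag file instead of logging or mailing.
run_once() {
	status=0
	$DAEMON_CMD >/dev/null 2>&1 || status=$?
	echo "DHCP exited with status $status on $(hostname)" > "$FLAG"
}

# Script 1: the keepalive loop. It only ever sets the flag.
keepalive() {
	while true ; do
		run_once
		sleep 5		# do not spin fast on a persistent failure
	done
}

# Script 2: one pass of the flag-watcher. Run it from a loop or cron.
# If the flag exists, report its contents and clear it.
check_flag() {
	if [ -f "$FLAG" ] ; then
		msg=$(cat "$FLAG")
		$REPORT_CMD "$msg"
		rm -f "$FLAG"
	fi
}
```

With something like this, a clean stop would be: stop whatever is
running check_flag, stop the keepalive loop and dhcpd, then remove the
flag file by hand.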

We don't actually auto-restart our DHCP servers, crucial though they
are. We find failover so reliable that we rely on it to carry us
over any minor disturbances. Anything that takes out both servers will
be big enough that restarting them won't help much...

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer at biplane.com.au)                   +61-2-64957160 (h)
http://www.biplane.com.au/~kauer/                  +61-428-957160 (mob)