Primary server stuck in "recovering" on restarts

Wed Jul 30 16:57:16 UTC 2014

This is likely caused by the "gentle shutdown" feature we added in 4.2.6 and 4.3.0.
For that feature we added a signal handler that catches some signals and shuts the server
down cleanly.  Unfortunately, one side effect of this change was to put the peer into
partner-down.

We have backed out this change for 4.2.7 and 4.3.1.  Both are currently in beta; if people
are testing them, we'd like to hear about your results.

In the meantime, if this is the problem, you should be able to avoid it by using a hard kill
such as "kill -9" to stop the process.
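
For a scripted restart, the hard kill could be sketched roughly as follows. This is
only an illustration, not something shipped with ISC DHCP: the pid file path and the
helper names read_pid and hard_kill are assumptions, so adjust them to match your
build and init scripts.

```python
import os
import signal

# Assumption: where dhcpd writes its pid file; this varies by build and distribution.
PID_FILE = "/var/run/dhcpd.pid"


def read_pid(pid_file):
    """Return the integer PID stored in pid_file (hypothetical helper)."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def hard_kill(pid):
    """Send SIGKILL (the same signal as "kill -9") to pid.

    SIGKILL cannot be caught, so the process gets no chance to run a
    signal handler.  Returns False if no process with that PID exists.
    """
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return False
    return True
```

A restart script would call hard_kill(read_pid(PID_FILE)) in place of its normal stop
step and then start the server as before.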

Shawn

On Jul 30, 2014, at 9:13 AM, Patrick Trapp wrote:

> We recently upgraded to 4.2.6 for our ISC DHCP implementation (using failover) on a pair of Debian servers. We specifically needed the patch for the longer bootstring parameters before we upgrade other systems on our network, but it was not available from Debian packages yet, so we manually compiled the necessary changes in.
> 
> We have a lot of minor tweaks to host entries and group entries in our configuration - maybe up to a dozen a day - so we have scripted out some of the steps for synchronizing our configurations between the servers. The last script stops the primary server, restarts it, stops the secondary server, and restarts it. This has worked great for years and was working on 4.2.2 (the most recent version prior to this patch/upgrade).
> 
> However, whenever we run our script to restart the two servers now, the primary server gets stuck in a recovering state and never starts offering addresses. I have gotten it to straighten out by stopping it again manually and restarting it a few times. At least once I had to restart the secondary server before the primary seemed to wake up, but that might have been me being impatient waiting for the primary.
> 
> Is there something different in 4.2.6 that we need to accommodate in our restart script? Or have we been playing with fire all this time and our luck just ran out? Or, to put it another way, what would be the best way to implement this restart? We are not changing states on either server to partner-down or anything like that - should we be?
> 
> Appreciate any suggestions.
