Procedure for failover partner replacement.

Thu May 11 22:22:07 UTC 2017

As long as you've put the remaining server into partner-down state, it will remain there until its peer has finished recovery.
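
One way to get there, as described in dhcpd(8), is to stop dhcpd on the surviving server, edit the last failover state declaration in its leases file, and start it again. The fragment below is only a sketch: the peer name "dhcp-failover" and both timestamps are made up, and the exact syntax should be checked against the man page for your version.

```
# Last failover state entry in dhcpd.leases on the surviving server.
# Peer name and timestamps are hypothetical.
failover peer "dhcp-failover" state {
  # was: my state communications-interrupted at ...
  my state partner-down at 4 2017/05/11 14:40:00;
  # partner line left as the server recorded it
  partner state normal at 4 2017/05/11 13:00:00;
}
```

The same state change can also be made on a running server through OMAPI with omshell, which avoids the restart.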

Regards,
Chris

> On May 11, 2017, at 8:17 AM, perl-list <perl-list at network1.net> wrote:
> 
> That is an interesting idea, Chris, but in my experience both peers will enter recover mode at step 7 and won't answer DHCP requests until the recover-wait (MCLT) period expires or you manually intervene. As always, YMMV.
> 
> 
> From: "Chris Buxton" <clists at buxtonfamily.us>
> To: "Users of ISC DHCP" <dhcp-users at lists.isc.org>
> Sent: Thursday, May 11, 2017 10:38:48 AM
> Subject: Re: Procedure for failover partner replacement.
> On May 11, 2017, at 6:35 AM, Bob McDonald <bmcdonaldjr at gmail.com> wrote:
> > 
> > I've got a failing DHCP failover partner. (The partner is an HA cluster and both nodes are being RMAed. Long story.)
> > 
> > My question is this. Is the following procedure ok for the replacement? (I've already confirmed the new version of DHCP is exactly the same as the old one)
> > 
> > 1) Before shutting down the failing partner cluster, stop dhcpd and save the dhcpd.leases and dhcpd.conf files.
> > 2) Shut down the failing partner cluster completely.
> > 3) Bring up the replacement partner cluster while leaving dhcpd turned off.
> > 4) Restore the dhcpd.leases and dhcpd.conf files.
> > 5) Restart dhcpd on the replacement partner cluster.
> > 
> > My contention is that this will result in the failover pair going into partner-interrupted state for about 5 or 10 minutes while the HA cluster is replaced and then should restart communications as if nothing happened when the replacement partner comes live. Thoughts?
> 
> Here is what I would do:
> 
> 1. On both failover peers (both clusters), set 'max-unacked-updates 1000;'.
> 2. Save the old dhcpd.conf and any included files from the failing peer cluster. Do not save the leases file.
> 3. Shut down the failing cluster completely.
> 4. Put the remaining failover peer into partner-down state.
> 5. Bring up the replacement cluster with dhcpd not running.
> 6. Restore the dhcpd.conf (including the 'max-unacked-updates' statement).
> 7. Start dhcpd on the replacement cluster.
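> 
> For illustration, step 1 amounts to one extra statement inside the existing failover declaration on each peer. This is only a sketch: the peer name, addresses, ports, and every value other than 'max-unacked-updates 1000;' are placeholders, not taken from this thread.
> 
> ```
> # dhcpd.conf on the surviving peer, assumed here to be the primary.
> # All names, addresses, and timer values are hypothetical.
> failover peer "dhcp-failover" {
>   primary;
>   address 192.0.2.1;
>   port 647;
>   peer address 192.0.2.2;
>   peer port 647;
>   max-response-delay 60;
>   # the statement from step 1
>   max-unacked-updates 1000;
>   # mclt and split belong on the primary only
>   mclt 3600;
>   split 128;
>   load balance max seconds 3;
> }
> ```
> 
> The replacement peer would carry the same 'max-unacked-updates' line in its own declaration, with 'secondary;' in place of 'primary;' and without 'mclt' and 'split'.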
> 
> At step 3, the remaining peer will move to communications-interrupted. But step 4 will change this, so that you don't have to worry about pool exhaustion during steps 5 and 6. At step 7, the new peer will move to recover state, sync with the master, and then move to normal state. At that point, the other peer will automatically move from partner-down to normal state.
> 
> Regards,
> Chris
> _______________________________________________
> dhcp-users mailing list
> dhcp-users at lists.isc.org
> https://lists.isc.org/mailman/listinfo/dhcp-users