failover, partner-down state, MCLT and rewind binding

Gregory Sloop gregs at sloop.net
Tue Nov 10 16:23:00 UTC 2015



CA> On 10/11/2015 01:43, Gregory Sloop wrote:
>> So, I'm looking for a little more understanding. I had an outage last
>> week that didn't work out so well.
>> I've had sort-of-similar problems in the past with this setup, and I
>> *think* I know some of what happened this time, but wanting confirmation.

>> After the last similar outage, I knew we needed to put the surviving
>> peer in "partner-down" mode, and this, along with the new "rewind state

CA> Hi Greg,

CA> I've just (belatedly, remembering that they had been written, but not
CA> able to find them readily), made the two KB articles below published/public.

CA> Hoping that they help - particularly around tweaking the default
CA> failover settings and why a very small MCLT is not necessarily a good idea.

CA> https://kb.isc.org/article/AA-00268/31/DHCP-Failover-and-MCLT-configuration-implications.html

CA> https://kb.isc.org/article/AA-00327/31/Why-are-the-lease-times-short-and-random-during-communication-interrupted-state.html

CA> Cathy

Thanks Cathy. But those documents don't add a lot of light to the discussion.

Specifically:

1) I understand why the lease extensions are different times and not just the MCLT time. BUT - the lease extensions should *NEVER* be shorter than the MCLT time, right? [They'll be of varying lengths, because there will/may be varying time left on the original leases - from before the fail-over pair went down. But once all the "original" leases have expired, all the remaining leases should be MCLT time, right?]

2) I *think* what I read essentially verifies what I said about MCLT times. If you use really short MCLT times, it's going to put extra load on the environment [network and servers] even in regular mode. [This is because the initial lease, even when running in "communications-normal" mode, is for the MCLT time. _After that initial lease_, however, clients will get the regular DHCP lease time.] So, I realize that _really_ short MCLT times can adversely impact the performance of your servers both in communications-normal mode and in interrupted mode [as well as other recovery or failure modes]. However, I don't see any reason, _other than performance_, to select longer MCLT times. Do I understand that correctly?

So, is there some reason/benefit, other than performance [load on network, clients and servers] to select longer MCLT times?

And as a corollary, I think MCLT times, provided your server can handle the load, should be some small fraction of the DHCP lease time. My initial thought - which wasn't encumbered by a lot of deep thought - is around 20-25% of the regular DHCP lease time. That would mean that the server/network should be able to sustain about four to five times the regular load in a failure situation.
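
A rough sketch of the arithmetic behind that four-to-five-times estimate. It assumes every client renews at a fixed fraction of its lease (the usual T1 default of 50%), and that in the failure case every lease handed out is exactly MCLT long. The function name and the one-day lease are made up for illustration; nothing here comes from dhcpd itself.

```python
def renewal_load_multiplier(lease_seconds, mclt_seconds, renew_fraction=0.5):
    """Estimate how much the renewal rate grows when every lease is
    MCLT long instead of the regular lease time.

    Assumption: each client renews once every renew_fraction * lease,
    so the per-client renewal rate is 1 / (renew_fraction * lease).
    """
    if lease_seconds <= 0 or mclt_seconds <= 0:
        raise ValueError("lease time and MCLT must be positive")
    if not 0 < renew_fraction <= 1:
        raise ValueError("renew_fraction must be in (0, 1]")

    normal_rate = 1.0 / (renew_fraction * lease_seconds)
    failure_rate = 1.0 / (renew_fraction * mclt_seconds)
    # renew_fraction cancels out: the ratio is simply lease / MCLT.
    return failure_rate / normal_rate


if __name__ == "__main__":
    lease = 86400  # hypothetical one-day regular lease, in seconds
    for mclt in (21600, 17280):  # 25% and 20% of the regular lease
        factor = renewal_load_multiplier(lease, mclt)
        print(f"MCLT {mclt}s -> about {factor:.1f}x the regular renewal load")
```

The renewal fraction drops out of the ratio, so the multiplier is just lease time divided by MCLT; real clients retry and rebind on their own schedules, so treat this as a ballpark figure only.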

Other than answers to the above direct questions - I'm happy for any wider-ranging discussion on the thread.

-Greg