Recovery protocol - minimize outage time
dougb at dougbarton.us
Fri Jun 20 03:56:28 UTC 2014
On 06/19/2014 01:00 PM, Amanda Edades wrote:
> What are max-response-delay and mclt times others have used in
> production with success in times of a failure?
That question is like asking, "How long is a piece of string?" The
question lacks the information needed to formulate a reasonable answer.
The right answers for your network(s) depend on the characteristics of
> Is there a way to "override" the lease database transfer to speed it up
> or has anyone attempted to do so? (e.g. override the database transfer
> and perform this manually in a faster way, then manually signal both
> servers that the transfer is done)
You would not want to do that, for a variety of reasons, not the least
of which is that it's overwhelmingly unlikely that you would be able to
optimize the process better than dhcpd is already doing it, and you
would be unlikely to get all the right factors taken into account even
if you did. Then of course you'd have to retest your changes for each
new version, etc. ...
> I plan to set mclt time to 60 seconds, making the secondary server issue
> leases of 60 seconds, but also minimizing the waiting period during
> which no servers will respond to requests to 60 seconds.
Um, I'm not sure that means what you think it means. :) Those are
separate parameters. If you want to set "max-response-delay 60;" that
should be fine. For mclt you really want something longer ... I would
never recommend less than 30 minutes, given that the clients will do a
refresh at 1/2 their lease time.
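
To put rough numbers on the half-lease refresh point, here is a
back-of-the-envelope sketch. It assumes every client holds a lease of the
given length and that renewals are spread evenly over time, which is a
simplification and not a description of how dhcpd schedules anything. The
function name is made up for illustration, and the 350,000 figure is the
lease count quoted further down in this thread.

```python
def renewals_per_second(clients: int, lease_seconds: float) -> float:
    """Average renewal rate, assuming each client renews at half its lease time."""
    renew_interval = lease_seconds / 2  # clients refresh at 1/2 the lease time
    return clients / renew_interval


# Compare a 60-second lease with a 30-minute (1800-second) lease.
for lease in (60, 1800):
    rate = renewals_per_second(350_000, lease)
    print(f"{lease:>5} s lease: ~{rate:,.0f} renewals/s")
```

Under those assumptions a 60-second lease works out to roughly 11,667
renewals per second, versus roughly 389 per second at 30 minutes.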
> I estimate the
> time to transfer the lease database will be at least 16 minutes for
> the number of leases we have - about 350,000. (I observed about 22,000
> leases transferred per minute in the lab for empty/free leases.)
I don't know for sure if your estimate is accurate, although I suspect
it's not. In any case, why do you care? It will take as long as it
takes, and both servers will do their thing in the meantime.
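
For reference, the arithmetic behind the quoted estimate looks like this.
It is only a sketch: it assumes the lab rate of 22,000 leases per minute
holds constant for the whole transfer, which may well not be true for
active leases or on production hardware. The function name is made up for
illustration.

```python
def transfer_minutes(leases: int, leases_per_minute: float) -> float:
    """Naive transfer time: total leases divided by a constant transfer rate."""
    return leases / leases_per_minute


# Figures from the quoted message: 350,000 leases at 22,000 leases/minute.
estimate = transfer_minutes(350_000, 22_000)
print(f"~{estimate:.1f} minutes")
```

That comes out just under 16 minutes, so the quoted figure is a straight
division of the two numbers and nothing more.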
> In the worst case, I am not sure if a single server can handle
> potentially up to 350,000 leases being renewed every 60 seconds for 16+
> minutes... I plan to test this also, but would like to understand how
> the lease database is transferred to try to save time there.
You are almost certainly attempting to over-engineer this. Start with
the defaults as described in the dhcpd.conf man page. Only make changes
if you understand them, and have a really good reason to make the change.