Is there anything new on the DHCP Failover Horizon? An Update
martin at dc.cis.okstate.edu
Sat Apr 16 12:22:48 UTC 2011
This was at one of our remote campuses. Although it is in a
large metro area, the wind we had yesterday knocked out power to
parts of that campus, so both DHCP servers went down hard. One
came back up when power was restored. The other tried to, but
its CMOS configuration had been set to alert on removal of the
cover, so it just sat there for several hours until we got
someone to go in and restart it. It came right back up, and the
"peer holds all free leases" condition on the subnet in question
resolved itself about 55 seconds after the downed host came back
to life.
This very good outcome brings me to one more question.
If nobody does anything except restart the system that got into
trouble, how long should it normally take for the two systems to
resolve the imbalance that developed when one of them got
ripped out of the loop?
The lease time on the network in question was 2 days. We
also have a secondary problem: the first "peer holds all free
leases" messages occurred well before the outage, which
indicates that the two servers are not getting the same feed
from the routers in question, so this must be corrected as soon
as possible. Our remote campuses fall victim to this condition
more often than I care to think about. The equipment is older,
and telling people that both servers must see the same traffic
seems to almost fall on deaf ears.
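Since those messages showed up well before the outage, one rough
way to catch the condition early is to count them per relay in
the syslog output. The following is only a sketch in Python. The
function name count_peer_holds and the sample log lines are made
up for illustration, and the pattern assumes the message looks
like "DHCPDISCOVER from <mac> via <relay>: peer holds all free
leases", which may need adjusting for the local log format.

```python
import re
from collections import Counter

# Assumed message shape; adjust the pattern to match the local syslog format.
PATTERN = re.compile(
    r"DHCPDISCOVER from (?P<mac>[0-9a-fA-F:]+)(?: \([^)]*\))? "
    r"via (?P<relay>\S+): peer holds all free leases"
)


def count_peer_holds(lines):
    """Count 'peer holds all free leases' messages per relay or interface."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("relay")] += 1
    return counts


# Hypothetical sample lines, not taken from a real server.
SAMPLE = [
    "Apr 15 09:01:02 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:55 "
    "via 10.1.2.1: peer holds all free leases",
    "Apr 15 09:01:05 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:66 "
    "via 10.1.2.1: peer holds all free leases",
    "Apr 15 09:02:00 dhcp1 dhcpd: DHCPACK on 10.1.3.20 "
    "to 00:11:22:33:44:77 via 10.1.3.1",
    "Apr 15 09:03:10 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:88 "
    "via 10.1.3.1: peer holds all free leases",
]

if __name__ == "__main__":
    # Relays with the most complaints are listed first.
    for relay, n in count_peer_holds(SAMPLE).most_common():
        print(f"{relay}: {n}")
```

A relay that keeps turning up in this count while both servers
are running would be a candidate for the "servers are not seeing
the same traffic" problem described above.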
My question boils down to this: what single factor
determines how long it takes to clean up the mess automatically?
If we know that there are slight imbalances, maybe we
can stay ahead of the entropy effect with shorter lease times.
I think our organization started using the bootP server
around 1995 or 96 and, when operated correctly, it just doesn't
I more or less chose a default/maximum lease time of
around 2 days many years ago because it seemed like a good
compromise between churning the server all day long and
situations in which some networks appear to fill up because
their clients are highly transient.
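To put rough numbers on that compromise, here is a small Python
sketch of renewal load versus lease time. It assumes every client
renews at T1, which defaults to half the lease time in RFC 2131.
The function name renewals_per_day and the figure of 1000 clients
are hypothetical, and real traffic will also include new
DISCOVERs, rebinds, and releases.

```python
DAY = 86400  # seconds in a day


def renewals_per_day(lease_seconds, clients, t1_fraction=0.5):
    """Approximate renewals per day, assuming each client renews at T1."""
    renew_interval = lease_seconds * t1_fraction
    return clients * DAY / renew_interval


if __name__ == "__main__":
    # Hypothetical population of 1000 clients on one server pair.
    for label, lease in [("2 days", 2 * DAY), ("1 day", DAY), ("4 hours", 4 * 3600)]:
        print(label, renewals_per_day(lease, clients=1000))
```

Under those assumptions a 2-day lease works out to about one
renewal per client per day, and a 4-hour lease to about twelve,
so shortening the lease raises server load in direct proportion.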
Any ideas are appreciated by me and, eventually, by our
Martin McCormick WB5AGZ Stillwater, OK
OSU Information Technology Department Telecommunications Services Group