Is there anything new on the DHCP Failover Horizon?
martin at dc.cis.okstate.edu
Fri Apr 15 20:55:32 UTC 2011
We have been using DHCP failover for several years and
like the fact that one server can die and no phones ring for a
while, at least.
The problems arise when the real world, which never fails
to prove the old saying that if anything can go wrong, it will,
springs one of its little surprises, such as wireless
controllers that end up sending different data to the two servers
or, as we had today, weather-induced power hits that appear to
have brought down one server while the other one stayed up but
in a "peer holds all free leases" lock-down state.
That has caused some staff members to ask whether
failover really buys us any redundancy. I know it does when
things fail cleanly, such as when one server's dhcpd process dies
or the power goes away cleanly from a box, but I was asked to see
if there are any other failover strategies under consideration
that self-heal a bit faster.
The discussions almost always start when we discover
that one server in the pair has gone into the "peer holds all free
leases" condition and people are not getting leases. Is there a
rapid way to clear it, when discovered?
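
To at least discover the condition sooner, here is a minimal
sketch of a log scan. It assumes dhcpd logs lines of the form
"DHCPDISCOVER from <mac> via <relay-or-interface>: peer holds all
free leases"; the function name and the sample lines are made up
for illustration, and it only detects the condition, it does not
clear it.

```python
import re
from collections import Counter

# Assumed log shape (check against your own syslog output):
#   "... DHCPDISCOVER from <mac> via <relay-or-iface>: peer holds all free leases"
PEER_HOLDS = re.compile(r"via (\S+?): peer holds all free leases")


def count_peer_holds(lines):
    """Count 'peer holds all free leases' log lines per 'via' source."""
    counts = Counter()
    for line in lines:
        match = PEER_HOLDS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, not taken from a real server.
sample = [
    "Apr 15 15:02:11 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:55 "
    "via 10.1.2.1: peer holds all free leases",
    "Apr 15 15:02:12 dhcp1 dhcpd: DHCPACK on 10.1.3.20 to 00:11:22:33:44:66 "
    "via 10.1.3.1",
    "Apr 15 15:02:14 dhcp1 dhcpd: DHCPDISCOVER from 00:11:22:33:44:77 "
    "via 10.1.2.1: peer holds all free leases",
]

for source, hits in count_peer_holds(sample).items():
    print(source, hits)
```

Fed the recent tail of the dhcpd log from cron, something like
this could page someone when one relay address starts piling up
hits, instead of waiting for the phones to ring.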
Of course, the real cure is to not send different data
to both servers. We serve around 10,000 clients here and can go
for months without a single "peer holds all" message, but when
something goes wrong, we get a situation that, with a single
DHCP server, at least would not monkey-wrench the whole subnet.
Telling everyone that it is only that one subnet, etc.,
is a hard sell. The voices just get louder and the questions
Any ideas as to the best way to handle "no free leases"
are appreciated, as I have never found anything that was really
clean, since the condition that causes it is by nature an error
and dhcpd is simply trying to be as safe as possible.