DHCP failover problems - still
rblayzor.bulk at inoc.net
Tue May 12 15:01:06 UTC 2009
On May 7, 2009, at 5:25 PM, David W. Hankins wrote:
> We have at least two bugs with failover reconnection attempts, of
> which one has a fix committed to maintenance branches (will be in
> 4.1.1-next, 4.0.2b1, etc), and the other is being reviewed for
This, at least in our situation, does not get to the root of the
problem. The reconnection and keepalives are certainly an issue when
the links separate between two peers. In our situation, however, we
notice that active servers (with several messages a second between
them) suddenly each decide that the other has stopped responding.
Usually one logs that the other has timed out and disconnects, and then
20 seconds later the other one logs a timeout. All this happens even
when all other network connectivity between the two servers is fine.
Then of course, they never reconnect again.
> Failover will claim a link has been disconnected if it is idle (no
> received messages) for more than the configured max-response-delay
> (default 20 seconds), or if the socket has been disconnected. A
> contact message is used at 1/3rd the max-response-delay to keep the
> socket from going idle.
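
To make the rule quoted above concrete, here is a minimal sketch of it as I read it. This is a toy model, not ISC dhcpd code: the function names and the 20-second constant are assumptions taken only from the description above (idle for more than max-response-delay, or socket disconnected, with a contact message at one third of that delay).

```python
# Toy model of the idle-timeout rule quoted above; NOT ISC dhcpd source.
# All names here are hypothetical and chosen for illustration only.

MAX_RESPONSE_DELAY = 20.0  # seconds; the default stated in the quoted text


def contact_interval(max_response_delay=MAX_RESPONSE_DELAY):
    """Interval between contact messages: one third of max-response-delay."""
    return max_response_delay / 3.0


def link_is_down(now, last_received, socket_connected=True,
                 max_response_delay=MAX_RESPONSE_DELAY):
    """Return True if the peer link would be declared disconnected.

    The link is considered down when the socket has been disconnected,
    or when no message has been received for more than max_response_delay.
    """
    idle = now - last_received
    return (not socket_connected) or idle > max_response_delay


if __name__ == "__main__":
    print(contact_interval())          # seconds between contact messages
    print(link_is_down(100.0, 99.5))   # peer heard from half a second ago
    print(link_is_down(100.0, 79.0))   # nothing received for 21 seconds
```

Under this reading, two peers exchanging several messages a second should never come close to the idle limit, which is why the timeouts we see on otherwise healthy links are surprising.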
Is this something new only in 4.x? We don't think we've seen this in
Robert Blayzor, BOFH
rblayzor at inoc.net