DHCP failover problems - still

Matt Causey matt.causey at gmail.com
Fri Nov 20 03:38:41 UTC 2009

On Thu, Nov 19, 2009 at 6:41 PM, Matt Causey <matt.causey at gmail.com> wrote:
>> Unfortunately there is not a great deal of useful logging around
>> these events.  A socket suddenly closing or resetting I think only
>> results in the state changes, whereas a CONTACT message timeout
>> causes a message with the word "timeout" in it to be logged before
>> the state change is logged.
> Ok, so we are seeing this:
> Nov 20 00:10:51 blah-101 dhcpd: DHCPACK on blah to 00:15:70:89:6d:00 via blah
> Nov 20 00:11:32 blah-101 dhcpd: timeout waiting for failover peer failover
> Nov 20 00:11:32 blah-101 dhcpd: peer failover: disconnected
> Nov 20 00:11:33 blah-101 dhcpd: failover peer failover: I move from normal to communications-interrupted
In case anyone is interested, I've attached a small capture, about 108
frames.  The frames before the first RST are normal traffic.  It seems
that the socket drops and gets restarted every so often (note the
3-way handshake just before the RST; I'm not sure if that is normal or
not).

After that first RST-ACK, things seem to recover, but then the CONTACT
messages and binding updates stop.  What follows is a string of attempts
to restart the socket (a 3-way handshake followed by an RST-ACK each
time), and it carries on that way until I bounce one of the dhcp daemons.
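
To put numbers on the handshake/RST pattern described above, a capture
can be tallied by TCP flags.  This is only a minimal sketch: the
function name count_tcp_flags is made up for illustration, and it
assumes a classic (non-pcapng, microsecond-timestamp) pcap file with
Ethernet framing and IPv4.  It does not filter on the failover port,
since the port in use is not stated in this thread.

```python
import struct


def count_tcp_flags(pcap_bytes):
    """Count SYN, SYN+ACK and RST segments in a classic pcap byte string.

    Assumption: Ethernet link type and IPv4 packets. Other frames are skipped.
    """
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic pcap file")

    # The link type is the last 4 bytes of the 24-byte global header.
    (linktype,) = struct.unpack_from(endian + "I", pcap_bytes, 20)
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    counts = {"syn": 0, "synack": 0, "rst": 0}
    offset = 24  # skip the global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, incl_len, orig_len.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
        if frame[14 + 9] != 6:  # IP protocol 6 is TCP
            continue
        tcp = 14 + ihl
        if len(frame) < tcp + 14:
            continue

        flags = frame[tcp + 13]  # TCP flags byte
        if flags & 0x04:  # RST (with or without ACK)
            counts["rst"] += 1
        elif flags & 0x02:  # SYN
            counts["synack" if flags & 0x10 else "syn"] += 1
    return counts
```

Reading the attached file with open("failover_xaction_failure.pcap",
"rb").read() and passing the bytes in would give the three counts; a
run of SYNs each answered by an RST would show up as roughly matching
"syn" and "rst" totals.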

We're running an identical configuration in 30+ pairs in different
locations, and this is the only site with this problem, so I'm
certain it's not a dhcpd design flaw as such.  But I'd like to learn
more about what's happening, so that either we can fix it in the
software or I can add some automation around it to detect and repair this.
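
As a rough idea of what the detection half of such automation might
look like, here is a minimal sketch that scans syslog lines for the
messages quoted earlier in this message.  The function names
scan_failover_events and needs_attention are hypothetical, and the
patterns are based only on the log lines shown above; other dhcpd
versions may word them differently.  The repair step (for example,
restarting one daemon) is deliberately left out.

```python
import re

# Patterns modelled on the dhcpd log lines quoted in this thread.
TIMEOUT_RE = re.compile(r"dhcpd: timeout waiting for failover peer (\S+)")
STATE_RE = re.compile(r"dhcpd: failover peer (\S+): I move from (\S+) to (\S+)")


def scan_failover_events(lines):
    """Return (kind, peer, detail) tuples for failover-related log lines."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            events.append(("timeout", m.group(1), None))
            continue
        m = STATE_RE.search(line)
        if m:
            # detail is the state the server moved to
            events.append(("state", m.group(1), m.group(3)))
    return events


def needs_attention(events):
    """True if the most recent state change left the server out of 'normal'."""
    states = [detail for kind, _, detail in events if kind == "state"]
    return bool(states) and states[-1] != "normal"
```

Fed the log excerpt above, this would report one timeout and one move
to communications-interrupted.  A real check would probably also want
to wait some time before acting, since the pair may recover on its own.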


-------------- next part --------------
A non-text attachment was scrubbed...
Name: failover_xaction_failure.pcap
Type: application/octet-stream
Size: 18634 bytes
Desc: not available
URL: <https://lists.isc.org/pipermail/dhcp-users/attachments/20091119/76eb102d/attachment.obj>
