DHCP failover problems - still
matt.causey at gmail.com
Thu Nov 19 05:57:57 UTC 2009
I was re-reading this post from a while ago, and had a question...
On Thu, May 7, 2009 at 1:25 PM, David W. Hankins <David_Hankins at isc.org> wrote:
> On Wed, May 06, 2009 at 10:18:02PM +0200, sthaug at nethelp.no wrote:
>> This is a bug that I and several others have seen. I can reproduce it,
>> and I have tried to give ISC enough info to reproduce it (offering the
>> use of my lab if necessary). But so far no luck. See
> Sorry I never got back to you Steiner, your last set of information
> was complete enough that I could reproduce it trivially.
> We have at least two bugs with failover reconnection attempts, of
> which one has a fix committed to maintenance branches (will be in
> 4.1.1-next, 4.0.2b1, etc), and the other is being reviewed for
> Failover will claim a link has been disconnected if it is idle (no
> received messages) for more than the configured max-response-delay
> (default 20 seconds), or if the socket has been disconnected. A
> contact message is used at 1/3rd the max-response-delay to keep the
> socket from going idle.
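As I read that paragraph, the timing works out roughly as below. This is only my own sketch of the rule as described, not code from dhcpd; the function names and the socket_open flag are made up for illustration, and the 20-second figure is the default quoted above.

```python
def contact_interval(max_response_delay: float) -> float:
    """Interval at which a contact message is sent: 1/3 of max-response-delay."""
    return max_response_delay / 3.0


def link_considered_down(last_received: float, now: float,
                         max_response_delay: float = 20.0,
                         socket_open: bool = True) -> bool:
    """True if the link would be declared disconnected under the rule above.

    The link counts as down when the socket is closed, or when nothing has
    been received for more than max_response_delay seconds.
    """
    idle_for = now - last_received
    return (not socket_open) or idle_for > max_response_delay


if __name__ == "__main__":
    # With the quoted 20-second default, contact messages go out about
    # every 6.7 seconds, so a healthy link should never look idle.
    print(round(contact_interval(20.0), 2))
    print(link_considered_down(last_received=0.0, now=21.0))
```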
If I'm seeing a rash of random 'dhcpd: failover peer failover: I
move from normal to communications-interrupted' events, is there a
good way to know which of the two happened, i.e. was it a lack of
contact messages or a disruption of the TCP socket? I'm going to run
tcpdump on the system to catch the moment of the disconnect
(tcpdump -w failover.pcap -xX -s0 -n -i eth0 'tcp and (port 519 or port 520)')
but I wondered if you knew of a better way to track down a problem
where two servers randomly stop communicating via the failover
channel while all other services on the boxen are fine.
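For what it's worth, here is roughly how I'd sort the events once I have a capture. It is only a sketch under my own assumptions: classify_interruption and its inputs are made-up names, the (timestamp, flags, payload length) tuples for segments received from the peer would have to be pulled out of the pcap by some other tool, the 20-second default is just the figure quoted above, and the 5-second window is a guess.

```python
from typing import Iterable, Tuple

# (timestamp in seconds, TCP flags such as "P", "F", "R", payload length)
Segment = Tuple[float, str, int]


def classify_interruption(received: Iterable[Segment],
                          event_time: float,
                          max_response_delay: float = 20.0,
                          window: float = 5.0) -> str:
    """Guess why the peer went to communications-interrupted at event_time.

    Only segments received from the peer at or before event_time are used.
    """
    earlier = sorted(s for s in received if s[0] <= event_time)

    # A FIN or RST shortly before the event suggests the socket was torn down.
    for ts, flags, _length in earlier:
        if event_time - ts <= window and ("F" in flags or "R" in flags):
            return "socket-disconnect"

    # Bare ACKs carry no failover message, so only count segments with payload.
    with_payload = [ts for ts, _flags, length in earlier if length > 0]
    if not with_payload or event_time - with_payload[-1] > max_response_delay:
        return "idle-timeout"

    return "unknown"


if __name__ == "__main__":
    quiet = [(0.0, "P", 40), (6.0, "P", 40)]
    print(classify_interruption(quiet, event_time=27.0))
    reset = quiet + [(12.0, "R", 0)]
    print(classify_interruption(reset, event_time=12.5))
```

An "unknown" result would mean the capture shows neither a teardown nor a long enough silence, in which case the log timestamp and the capture clock are probably worth comparing first.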
> David W. Hankins "If you don't do it right the first time,
> Software Engineer you'll just have to do it again."
> Internet Systems Consortium, Inc. -- Jack T. Hankins