Failover peer separation revisited

Robert Blayzor rblayzor.bulk at inoc.net
Wed Nov 19 11:46:44 UTC 2008


On Nov 18, 2008, at 5:40 PM, sthaug at nethelp.no wrote:
> - Watch the traffic on both hosts with packet sniffer. Observe that
> when the hosts move to state communication-interrupted they start
> sending TCP SYNs (definitely expected) - but around 3 minutes after
> traffic has been blocked, the last packet is sent between the hosts.



My configuration is very basic, modeled after a bunch of examples, and
looks similar to yours.  I think the only real difference is that I'm
using TCP port 520.
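
For readers following along, a failover declaration of this general
shape would look roughly like the sketch below.  The peer name, the
addresses, and all timing values are illustrative placeholders and are
not taken from the actual configuration; only the port number (520)
comes from the message above.

```
# Hypothetical primary-side declaration; every value except the port
# is an assumed placeholder.
failover peer "dhcp-failover" {
  primary;
  address 192.0.2.10;          # this server (placeholder address)
  port 520;                    # TCP port mentioned in the message
  peer address 192.0.2.11;     # the secondary (placeholder address)
  peer port 520;
  max-response-delay 60;       # seconds of silence before the peer is assumed down
  max-unacked-updates 10;
  mclt 3600;                   # primary only
  split 128;                   # primary only
  load balance max seconds 3;
}
```

The secondary would carry a matching declaration with "secondary;" in
place of "primary;", the two addresses swapped, and no mclt or split
statements.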

Our two servers sit on the same L2 network (same VLAN), on the same
switch, and are only two switch ports apart (probably even on the same
ASIC!).

We're using Intel GigE server NICs via the "em" driver on FreeBSD 6.

Our load is very low compared to others.  We have only two different  
client subnets of about 250 hosts each.

In the packet captures I posted this week there is no idle time, no
loss of packets, no extended timeout, and the traces show that ALL
packets arrive on both sides of the connection.  Our primary server
just thinks the secondary has gone away, and shortly after (only
seconds) the secondary thinks the same.  They go into
communication-interrupted and then never send another packet to each
other again.

Other than the failover interrupted message, nothing useful is logged.

The servers run independently of each other for a while, but
eventually everything starts to get out of whack because I'll have two
servers serving the same pools with no coordination between them.
Things get crazy.

If I restart one or both servers (I think; maybe restarting just the
secondary is enough), things resync and are fine again for days or
weeks, then it happens again.

-- 
Robert Blayzor, BOFH
INOC, LLC
rblayzor at inoc.net
http://www.inoc.net/~rblayzor/





