3.1.1 Failover Speed (was: Re: 4.1.0a1 Failover sync speed)

Chuck Anderson cra at WPI.EDU
Wed May 28 21:22:17 UTC 2008


On Wed, May 28, 2008 at 03:56:31PM -0400, Michael Kaegler wrote:
> 3.1.1 with our configuration takes 2.5 hours.
> 4.0.0 takes 1.5 hours.
> 4.1.0a1 takes 4 hours.
>
> These are all way too long for us to keep dhcp unavailable for (for the 
> initial sync). I'm also worried about working problems and having to 
> restart this 4-hour process as part of the debug steps!

Are the clocks synchronized with NTP?  What are your configuration 
file contents?  At least post the failover stanzas and the subnet/pool 
stanzas.

> We do have 140 subnets (mostly /21s). The load averages on the test servers 
> were high (4.00 - 6.00) but the interfaces were only pushing 60kbit.

140 * 2048 (addresses in a /21) = 286,720.  I'm not sure what the 
expected performance should be for synchronizing almost 300,000 leases 
over failover.
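
As a rough sanity check on that figure, here is a small Python sketch. 
It assumes every subnet is a full /21; the 10.0.0.0 base address is 
only a placeholder, not one of your real networks, and since your 
ranges span "most of" each subnet the result is an upper bound.

```python
import ipaddress

def addresses_per_subnet(prefix_len):
    """Number of addresses in one subnet of the given prefix length."""
    # 10.0.0.0 is a placeholder base; strict=False tolerates host bits.
    net = ipaddress.ip_network(f"10.0.0.0/{prefix_len}", strict=False)
    return net.num_addresses

SUBNETS = 140                      # count taken from the quoted message
per_subnet = addresses_per_subnet(21)
total = SUBNETS * per_subnet

print(f"{per_subnet} addresses per /21, {total} across {SUBNETS} subnets")
```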

> Chuck Anderson asked if we had excluded all the fixed address assignments 
> (before we had tested with a version other than 4.1.0a1) and the answer is 
> 'no'. The subnets have pool ranges which span most of the (/21) subnet. 
> There are fixed-address lines within those ranges which will be removed 
> (but can't be yet, we still need them).

You have a configuration error.  You must not have fixed-addresses 
that are within ranges.  If you must have these fixed-addresses 
reserved, then you must explicitly skip them in any range statements 
you have.  E.g. if you have 10.0.0.10 as a fixed-address, you must 
make ranges like this:

pool {
	range 10.0.0.1 10.0.0.9;
	range 10.0.0.11 10.0.0.254;
}

If you don't do this, you will confuse the DHCP server, which may try 
to assign dynamic leases for addresses that match your fixed-addresses, 
causing an IP conflict.
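
With 140 subnets, splitting the ranges by hand would be tedious, so it 
may be easier to generate them.  Below is a minimal Python sketch of 
one way to do that; split_range is a hypothetical helper of my own, 
not part of ISC DHCP, and it assumes IPv4 addresses given as strings.

```python
import ipaddress

def split_range(start, end, fixed):
    """Return (first, last) sub-ranges of start..end that skip every
    address listed in fixed."""
    lo = int(ipaddress.IPv4Address(start))
    hi = int(ipaddress.IPv4Address(end))
    # Only fixed addresses that fall inside the range matter.
    skip = sorted({int(ipaddress.IPv4Address(a)) for a in fixed})
    skip = [a for a in skip if lo <= a <= hi]

    ranges = []
    cursor = lo
    for addr in skip:
        if addr > cursor:
            ranges.append((cursor, addr - 1))
        cursor = addr + 1          # step past the fixed address
    if cursor <= hi:
        ranges.append((cursor, hi))

    return [(str(ipaddress.IPv4Address(a)), str(ipaddress.IPv4Address(b)))
            for a, b in ranges]

# Emit range statements for the example above.
for first, last in split_range("10.0.0.1", "10.0.0.254", ["10.0.0.10"]):
    print(f"\trange {first} {last};")
```

Feeding it the fixed-address values for a subnet should give you the 
range lines to paste into the matching pool; check the output against 
your own config before relying on it.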

> Why is this not faster? What's happening other than transferring data? Can I 
> speed up the initial sync somehow?

Others will have to chime in about the expected performance numbers, 
since I don't have experience with that many pool addresses.  We have 
about 9,000 total addresses in range statements and are using failover 
without problems or long startup times.  Our servers take mere seconds 
to restart.

