dhcpd lease problems?

David W. Hankins David_Hankins at isc.org
Wed Jun 7 16:26:54 UTC 2006


On Tue, Jun 06, 2006 at 03:45:15PM -0700, Alan DeKok wrote:
>    Our analysis using "callgrind" indicates that 90% or more of start 
> time is new_address_range().  Most of that is the checks for overlapping 
> ranges, by looking up each newly allocated lease in the 
> lease_ip_addr_hash.  Given the fixed size of the hash table, the result 
> is pretty much O(N^2) in the number of managed leases.
> 
>    Hence the long startup time.

I would presume this was a virgin system with either zero or the
vast majority of configured leases not appearing in dhcpd.leases,
and a number of leases selected in vast majority of the compile-time
hash table sizes.

The operational reality 99.9% of the time is the converse.

Hash table distribuion on 3.0.x is also substantially different than
what's currently on the 3.1.x HEAD branch (3.0.x treats the IPv4
address as a string - accumulates each byte in 1-bit left-shift
increments and adds in the value:

	while (i--) {
		/* Add the character in... */
		accum = (accum << 1) + *s++;

		/* Add carry back in... */
		while (accum > 65535) {
			accum = (accum & 65535) + (accum >> 16);
		}
	}
	return accum % size;

Consider then that 0.0.0.0 hashes to bucket numbered (0 % 9973), and
255.255.255.255 hashes to (3825 % 9973)).  This is ludicrously
different than current HEAD code, which treats the 32-bit IP address as
a number that can be directly applied to modulus (so the range is
now ([0..UINT_MAX] % 9973), effectively capped at 2^32-1).

And 3.0.x doesn't even support hash table sizes exceeding 65535,
as you can see from the accumulator.  3.1.x is halfway to removing
that limitation - the integer based hash tables are not limited except
by unsigned size, the string based hash tables are probably going to be
tackled by my cohort Shane in his copious spare time.

>    If, instead, the leases are allocated as needed, DHCPOFFER's would 
> down a bit.  But since the server is already limited by sync's and 
> syslog, the extra call to lease_allocate() wouldn't matter much.

Until we get rid of the synchronization limitation, which is so near on
my plate I think I'm going to have to finish it before I start on my
vegetables (asynchronous ddns updates), and switch to a more bind9-like
if not identical logging infrastructure.

And no, we're not going to do that by removing the sync.

>    Once the leases are allocated as needed, rather than in one big 
> block,  free'ing leases becomes trivial.  The contiguous allocation & 
> initialization is causing the slow startup time, *and* the inability to 
> free leases.

This trades the current bet - use CPU at startup time to use less during
runtime - to the opposite.

This is a bet you can't win in DHCPv6 for example...the number of
leases you would have to enumerate would destroy you, and you would
never actually use nearly as many leases as you enumerated.

But in DHCPv4, it's a safe bet.  99.9% of ISC DHCP uses is by people who
only have barely enough address space to cover their needs (typical pool
utilizations run 80-90%, and there's sufficient visitation churn that
old entries ultimately bring consumption to 100%).  So even if you held
that bet, it would be proved wrong by the time it becomes most important
to perform best.

Note that this also substantially shifts where you think the startup
problems lie.

Note that to be failover protocol compliant, both servers even on
virgin pools would immediately have to enumerate half of the leases
as soon as they connected.

This list just goes on and on, I gave up long ago trying to solve it.


I also find it curious that you've accurately indicated a problem in
hash table sizing - and suggested a cure for hash table sizing errors
is to do something other than increase the size of the hash table (or
allow it to be right-sized), or even to address limitations in the
hashing functions that are causing it to use no more than 3825 buckets.


> > There is simply no compelling reason for this complexity, considering
> > that reserved dynamic lease are not only what's being asked for, but
> > are already completed.
> 
>    I'm not clear on the difference between the current hosts, and 
> reserved dynamic leases.  Is it just that the reserved dynamic leases 
> inherit the pool & range properties, where hosts don't?

This has been accurately answered by Simon.

There's an additional benefit here which is failover protocol compliance
(host records sit outside of failover, so deleting a failover-controlled
lease that is duplicated by a host statement is fairly astonishing, and
certainly violates the failover draft).

-- 
David W. Hankins		"If you don't do it right the first time,
Software Engineer			you'll just have to do it again."
Internet Systems Consortium, Inc.		-- Jack T. Hankins


More information about the dhcp-users mailing list