dhcpd lease problems?

Wed Jun 7 17:05:47 UTC 2006

David W. Hankins wrote:
 >> Given the fixed size of the hash table, the result
>>is pretty much O(N^2) in the number of managed leases.
>>
>>   Hence the long startup time.
> 
> I would presume this was a virgin system with either zero or the
> vast majority of configured leases not appearing in dhcpd.leases,
> and a number of leases selected in vast majority of the compile-time
> hash table sizes.

   For some tests, 10^6 managed leases, 10^4 active ones.

   In situations where the ranges are full, we haven't seen slow startup 
times be due to dhcpd.leases.  A second or two for a 1M file seems to be 
acceptable.  In contrast, new_address_range() was taking *minutes* in 
some cases.

   i.e. reading dhcpd.leases is fast enough to fit into the window of a 
client re-transmit, so the client doesn't notice a restart.  The O(N^2) 
behavior in new_address_range() means that clients will give up completely.

> And 3.0.x doesn't even support hash table sizes exceeding 65535,
> as you can see from the accumulator.  3.1.x is halfway to removing
> that limitation - the integer based hash tables are not limited except
> by unsigned size, the string based hash tables are probably going to be
> tackled by my cohort Shane in his copious spare time.

   http://www.freeradius.org/radiusd/src/lib/hash.c

   FNV is a fast & good hash, used in the above code.  The 
implementation given above is ~800 lines, including comments & test 
cases.  It scales to 10^6 or more hash table entries, it can handle any 
type of data, it's simple, and it's fast.

   License issues aside, it's worth looking at.

> But in DHCPv4, it's a safe bet.  99.9% of ISC DHCP uses is by people who
> only have barely enough address space to cover their needs (typical pool
> utilizations run 80-90%, and there's sufficient visitation churn that
> old entries ultimately bring consumption to 100%).  So even if you held
> that bet, it would be proved wrong by the time it becomes most important
> to perform best.
> 
> Note that this also substantially shifts where you think the startup
> problems lie.

   Are people really finding that reading dhcpd.leases is a problem on 
startup?  I'm a little surprised.

> Note that to be failover protocol compliant, both servers even on
> virgin pools would immediately have to enumerate half of the leases
> as soon as they connected.

   Yes, well.  Some optimizations work for a stand-alone server, but not 
for failover.  They can still be useful to some people, however.

> I also find it curious that you've accurately indicated a problem in
> hash table sizing - and suggested a cure for hash table sizing errors
> is to do something other than increase the size of the hash table (or
> allow it to be right-sized), or even to address limitations in the
> hashing functions that are causing it to use no more than 3825 buckets.

   I didn't say we avoided touching the hash tables. :)

> There's an additional benefit here which is failover protocol compliance
> (host records sit outside of failover, so deleting a failover-controlled
> lease that is duplicated by a host statement is fairly astonishing, and
> certainly violates the failover draft).

   Failover deals with leases, not with ranges.  So if a lease is 
removed from all lists & hash tables before failover starts, failover 
never notices.

   In other words, removing the lease at start time, before failover 
begins, is no different than manually editing dhcpd.conf to split the range.

   Alan DeKok.