Watching performance on a DHCP Server

Larry Apolonio isc-dhcp at rh73.com
Wed Nov 14 21:28:25 UTC 2007


David W. Hankins wrote:

>On Wed, Nov 14, 2007 at 11:01:04AM +0100, Shane Kerr wrote:
>  
>
>>Which version of DHCP were you using? We noticed a *massive* improvement in load
>>time using 3.1 over 3.0, and a further improvement with 4.0 over 3.1.
>>    
>>
>
>I have a suspicion that the measurement on 4.0 was impacted by some
>external event; it got more CPU because something else wasn't running
>or the data was cached in memory or something.
>
>
>I don't think we changed anything to improve DHCPv4 loading
>performance in 4.0, except that it carries everything from 3.1.x.
>
>There were three CPU performance improvements in the 3.1.x work:
>
>                        Changes since 3.1.0a1
>
>- Some default hash table sizes were tweaked, some upwards, some downwards.
>  3.1.0a1's tables resulted in a reduction in default server memory use.
>  The new selected values provide more of a zero sum (increasing the size
>  of tables likely to be populated, decreasing the size of tables unlikely).
>
>- Lease structures appear in three separate hashes: by IP address, by UID,
>  and by hardware address.  One type of table was used for all three, and
>  improvements to IP address hashing were applied to all three (so UID and
>  hardware addresses were treated like 4-byte integers).  There are now two
>  types of tables, and the uid/hw hashes use functions more appropriate
>  to their needs.
>
>                        Changes since 3.0 (New Features)
>
>- Some patches to improve DHCP Server startup speed from Andrew Matheson
>  have been incorporated.
>
>
>The last only affects loading time related to loading a dhcpd.leases
>file (insertion-sorting leases from dhcpd.leases into memory was "most
>pessimal").
>
>The middle one, the hash algorithm for IPv4 addresses, only affects
>IPv4 address hashing.  We're still using the old hash function for
>client identifiers and hardware addresses.  The result is that the
>hardware/client-id hash values may only fit in a 16 bit area, so
>however big you make your hash tables, we can only use 65536 buckets,
>and there is some predisposition away from a zero value, so there are
>some known collisions (made worse by 'frequently similar' identifiers,
>seeing as the first 3 bytes of MAC addresses are often the same for
>many clients).
>
>I toyed with a few other hash functions for these other purposes
>during the 3.1.x work primarily to try and exceed the 64k mark but
>also to see if we could get something with fewer collisions, but
>wasn't able to find one that worked well within our development
>schedule, and ultimately had to queue this for later.
>
>The first one is another halfway mark in finishing these performance
>concerns.  The main feature here is that we can now set hash table
>sizes on a table-by-table basis rather than having a 'one size fits
>all' philosophy (9,973 buckets for the 'DHCP Option Codes' hash
>table!).  The next step is to make the sizing dynamic as entries are
>added (or by counting what will be placed into it before allocation).
>
>  
>
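To make the 16-bit point above concrete, here is a small Python sketch.
The hash function toy_hash16 is a made-up stand-in (it is NOT the
function ISC DHCP uses); the only property it shares with the
description above is that its result fits in 16 bits.  With such a
hash, the number of buckets actually used cannot exceed 65536 however
large the table is made.

```python
def toy_hash16(key: bytes) -> int:
    """Hypothetical 16-bit hash, for illustration only (not ISC DHCP's)."""
    h = 0
    for b in key:
        h = (h * 31 + b) & 0xFFFF  # keep only the low 16 bits
    return h


def buckets_used(keys, table_size: int) -> int:
    """Count the distinct buckets that receive at least one key."""
    return len({toy_hash16(k) % table_size for k in keys})


# 200,000 fake 6-byte hardware addresses: 00:00 followed by a 10.x.x.x address.
keys = [b"\x00\x00" + n.to_bytes(4, "big")
        for n in range(0x0A000000, 0x0A000000 + 200_000)]

for size in (100003, 1000003):
    print(size, buckets_used(keys, size))
```

Both table sizes report the same bucket count, and that count can never
be more than 65536, because the hash value itself never is.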
I'm trying to understand this a little better, and I think I'm getting 
there.  The tool I have generates a bogus dhcpd.leases file.  The 
first two bytes of each MAC address are 00:00 and the remaining four are 
taken from the IP address, so an IP address of 10.10.10.10 gets a MAC of 
00:00:0a:0a:0a:0a.  Although I have used non-RFC 1918 IP addresses, some 
of the first octets are the same.  In recent tests I have reverted to 
a 10.x.x.x range, so now all MACs start with 00:00:0a.
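The MAC scheme described above can be sketched in a few lines of
Python.  This is only an illustration of the mapping, not the actual
tool; the function name mac_from_ipv4 is made up for the example.

```python
import ipaddress


def mac_from_ipv4(ip: str) -> str:
    """Build a fake MAC: two zero bytes followed by the four IPv4 octets."""
    octets = ipaddress.IPv4Address(ip).packed  # the address as 4 raw bytes
    return ":".join(f"{b:02x}" for b in b"\x00\x00" + octets)


print(mac_from_ipv4("10.10.10.10"))  # 00:00:0a:0a:0a:0a
```

Note that all addresses within one /8 end up sharing their first three
MAC bytes, which is the "frequently similar identifiers" situation
mentioned in the quoted message.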

Here's some weird data I hope someone can interpret for me.  I am using 
10.x.x.x IP addresses here.
In these five cases I have a configuration file that has 70 subnets with 
285460 total IP addresses available to lease.
I used
27d179a3c3fbef576566b456a1168246  dhcp-3.1.0.tar.gz
I created a leases file that takes up 80% of those addresses.

Here's what I came up with:

LEASE_HASH_SIZE      Startup time (real)
10007                4m38.148s
50023                15m33.150s
100003 (default)     21m29.172s
500009               0m12.338s
1000003              0m12.329s
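
For reference, here is a rough Python calculation of how many leases
land in each bucket on average for the table sizes tested, assuming an
ideal, perfectly uniform hash (an assumption for this sketch only; the
real distribution depends on the server's hash functions).

```python
TOTAL_ADDRESSES = 285460   # addresses available in the test configuration
FILL = 0.80                # fraction of addresses present in the leases file

leases = round(TOTAL_ADDRESSES * FILL)  # 228368 leases


def average_chain_length(n_leases: int, n_buckets: int) -> float:
    """Mean entries per bucket if the hash spread keys perfectly evenly."""
    return n_leases / n_buckets


for size in (10007, 50023, 100003, 500009, 1000003):
    print(f"{size:>8}  {average_chain_length(leases, size):6.2f} leases/bucket")
```

Under that idealized assumption a larger table should never be slower,
which is not what the 10007 to 100003 rows show, so average chain
length alone presumably does not account for those timings.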

Making this thing scale up is quite a trick.  In the real world we don't 
see this kind of startup time because the leases file and the number of 
IP addresses have not grown that large... yet.

There is also a double whammy here.
As the capacity grows, the server has to handle more queries per 
second by the fact that there are more clients making DHCP requests; at 
the same time, the DHCP server slows down and handles fewer queries per second.

So I'm trying to find that sweet spot to balance startup time, 
performance, and capacity.  At what point do we have to add another DHCP 
server? 

BTW, what would you say is typical DHCP performance for a 
machine that handles 250k IP addresses?

Does 130 4-way handshakes or 600 2-way handshakes per second sound right 
to you? (Based on the dhcperf tool)

Larry Apolonio


