David W. Hankins
David_Hankins at isc.org
Wed Jul 19 20:08:40 UTC 2006
Someone asked me recently about DHCP server performance and how
to tune a large installation. I thought my findings might be of
some more general use (or might spawn some discussion).
Naturally the first thing I thought of was disk performance: how
many fsync()s per second, and the usual advice about prepending
a "-" to your log file names in /etc/syslog.conf on the Linuxes,
reducing drive activity, etc. The idea being not to use a
regular local disk, but rather a more capable write-back array
or even a solid-state disk on a battery.
He wanted numbers.
I happened to have a dual Opteron system with 4GB of RAM and
a nice (old) 10-spindle raid array whose PCI controller uses a
battery backed-up cache (so it is write-back).
I don't have a solid state disk, so I manufactured a RAMdisk
as a very rough over-approximation.
Some datapoints, then. First, fsyncs per second (after writing
a fairly average sized lease structure first, every entry in one
of the lease databases one of you users has donated to me over
  Laptop Hard Drive:                                   40
  Mylex ExtremeRAID (DAC960 derivative) w/battery:    400
  RAMdisk:                                         34,966
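
For anyone who wants to repeat the fsync measurement on their own storage, here is a minimal sketch of that kind of test. It is not the program used for the numbers above: the function name, the 256-byte stand-in record and the one-second run time are all assumptions made for illustration.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, record=b"x" * 256, duration=1.0):
    """Append `record` and fsync() in a loop; return completed syncs per second.

    `record` is an assumed stand-in for one lease entry, not a real lease.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    count = 0
    try:
        start = time.monotonic()
        while time.monotonic() - start < duration:
            os.write(fd, record)  # write one stand-in lease entry
            os.fsync(fd)          # block until the storage reports it durable
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.remove(path)
    # Guard against a zero elapsed time on very coarse clocks.
    return count / max(elapsed, 1e-9)
```

Point `directory` at a path on the device under test (the local disk, the array, or a RAMdisk mount) and compare the returned rates. A write-back cache or a filesystem that ignores fsync() will report inflated numbers.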
That doesn't sync up with the "4-7 leases per second" benchmark
we hear off and on when someone visits us to report that ISC
DHCP severely underperforms Windows' (which does not sync its
lease db before answering clients).
So I created a very synthetic benchmark: I altered the DHCP IO
subsystem to pass down a manufactured DHCP DISCOVER and REQUEST
packet that I had foreknowledge the server would answer a
specific way. Infinite looped it, then counted replies. The
result is an infinite DISCOVER/OFFER/REQUEST/ACK, and all of
the server's code is exercised (minus the read() to pull the
packet off the socket - one less system call).
This was easier at that moment than constructing a daemon to
spam dhcp client requests.
It should also be noted I disabled ping-check (because of the
infinite loop, the timeout queue would never be processed).
Skipping the laptop hard drive, which I think we know will
do the 4-8 that's been reported in the past, I went straight
for the high performance subsystems. (Plus, this benchmark
isn't very usable on the laptop (packet flooding the broadcast
Mylex ExtremeRAID: 160 packets per second (in+out)
RAMdisk: 10,000 packets per second (in+out)
That's packets per second, so you'd have to halve the numbers
to get queries/responses numbers, then halve again to get
offer vs. ack (1 out of every 4 packets is one of these).
So that's 40 offers and 40 acks, 2,500 offers and 2,500 acks
respectively, per second.
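
The halving above can be written out as a small sketch. The function name is made up for illustration; the inputs are the two packet rates measured above.

```python
def per_second_breakdown(packets_per_second):
    """Split an in+out packet rate from the DISCOVER/OFFER/REQUEST/ACK loop."""
    # Half of all packets are server replies (the other half are client queries).
    replies = packets_per_second / 2
    # Replies alternate between OFFER and ACK, so each is half of the replies.
    per_type = replies / 2
    return {"offers": per_type, "acks": per_type}


for name, rate in [("Mylex ExtremeRAID", 160), ("RAMdisk", 10000)]:
    print(name, per_second_breakdown(rate))
```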
Removing the DISCOVERs and going straight for the REQUESTs (I
knew this lease was active in the db, so I knew it would work):
Mylex ExtremeRAID: 85 packets per second (in+out)
RAMdisk: 8,500 packets per second (in+out)
That's curious. I expected ~80 on the RAID array. Fsync() is
the limiting factor after all, and we do that only on the
requests, so I'd expect to get half the previous number. But
8,500 is not half of 10,000...that's odd.
Going back to compare fsync()s/s with requests/s:
                       fsync()s/s   requests/s
  Laptop hard drive:           40   4-7?
  Mylex ExtremeRAID:          400   40-42
  RAMdisk A:               34,966   2,500 (discover/offer/request/ack)
  RAMdisk B:               34,966   4,250 (request/ack)
So actual performance is (very) approximately 10% of the fsync()
rate, until you get into the very high number of leases per
second, at which point it is coupled at an even higher 12.2%.
If you are only serving clients in INIT state (or assuming some
worst case), it's lower at 7.2%. I'm guessing CPU starvation
is the cause, but I haven't had the time to repeat the test and
I wouldn't have expected there to be a curve there (or if there
was a curve, I expected it to go only down, as other non-fsync
related limitations took over as the new bottlenecks and faster
fsyncs mattered less and less).
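
The percentages above are simply requests per second divided by fsync()s per second. A quick sketch of that arithmetic, using only the figures from this message (the function and variable names are illustrative):

```python
def lease_to_fsync_percent(requests_per_second, fsyncs_per_second):
    """Express the request rate as a percentage of the raw fsync() rate."""
    return 100.0 * requests_per_second / fsyncs_per_second


measurements = [
    # (label, fsync()s/s, requests/s), taken from the figures in this message
    ("Mylex ExtremeRAID", 400, 40),
    ("RAMdisk A (discover/offer/request/ack)", 34966, 2500),
    ("RAMdisk B (request/ack)", 34966, 4250),
]

for label, fsyncs, requests in measurements:
    print(f"{label}: {lease_to_fsync_percent(requests, fsyncs):.2f}%")
```

The RAID array lands on 10%, the request-only RAMdisk run on roughly 12.2%, and the full four-packet RAMdisk run between 7.1% and 7.2%.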
But both of those are still smaller numbers than I'd previously
Anyway, I thought the above might be interesting to some users,
as at least a guideline in the absence of any true testing.
David W. Hankins                    "If you don't do it right the first time,
Software Engineer                    you'll just have to do it again."
Internet Systems Consortium, Inc.       -- Jack T. Hankins