Watching performance on a DHCP Server

Barr Hibbs rbhibbs at
Mon Feb 11 17:07:43 UTC 2008

In the case I reported, the clients were entirely within a single enterprise, and while it was certainly possible for NICs to be replaced from time to time, the client population was remarkably stable.  For us, a 7 day or even 31 day lease would have been appropriate.  Our users were instructed to shut down the clients every day at close of business, then to restart them the following business day, so we effectively had 100% of the clients doing an INIT-REBOOT at least once each business day, with well over 90% rebooting at 8:00 AM -- talk about a spike in network traffic!  Over time, the in-rush of INIT-REBOOT requests dropped significantly, thanks to changing workload requirements, expansion of working shifts, and the realization that considerable time could be saved at the beginning of each shift (not just mornings any longer) by utilizing sleep mode for power saving.

There is one last point I forgot to mention in my previous response...  our modification of the ISC server updated the leases file for each and every message processed that modified the lease.  Our server was based on version 2, so there were no DNS updates as part of the lease assignment and renewal process.

Basically, the more volatile your client population, the shorter the lease time should be, though that is not an absolute.  Consider operational hours, predictions of network traffic, number of routers/relay agents and their placement, and typical use patterns of the clients before deciding.
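
As a rough illustration of how lease time trades against server load, here is a back-of-the-envelope sketch in Python. It assumes clients renew at T1 = 50% of the lease (the RFC 2131 default) and that renewals are spread evenly over time, which the 8:00 AM spike described above shows is optimistic. The 12,000 client count is borrowed from the earlier message in this thread, and the function name is only for illustration.

```python
# Rough steady-state renewal rate for a stable client population.
# Assumption: every client renews at T1 = t1_fraction * lease time
# (0.5 is the RFC 2131 default) and renewals are evenly spread.

DAY = 86_400  # seconds


def renewals_per_second(clients: int, lease_seconds: int,
                        t1_fraction: float = 0.5) -> float:
    """Average renewals per second the server sees from `clients` hosts."""
    return clients / (lease_seconds * t1_fraction)


for label, lease in [("1 hour", 3_600), ("1 day", DAY),
                     ("7 days", 7 * DAY), ("31 days", 31 * DAY)]:
    rate = renewals_per_second(12_000, lease)
    print(f"{label:>8}: {rate:.4f} renewals/s")
```

Under these assumptions a 1 hour lease costs about 6.7 renewals per second, while a 7 day lease costs about 0.04. The average says nothing about bursts, which is why operational hours and use patterns still matter.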

I've never been a fan of permitting only known MAC addresses, as the daily maintenance of the server configuration in very large environments is a major pain, and what of NIC replacement without prior notice?  Just a few of my biases based on experience with programmable NICs, frequent moves, adds, and changes, and cheaply made NICs with high failure rates.

--Barr Hibbs

 -----Original Message-----
From: dhcp-users-bounce at [mailto:dhcp-users-bounce at]On Behalf Of Blake Hudson
Sent: Monday, February 11, 2008 07:51
To: dhcp-users at
Subject: Re: Watching performance on a DHCP Server

  Thanks Barr, it is always interesting to hear relative practical experiences. This is exactly the kind of problem I would like to prepare/plan for. I've read that Microsoft defaults to an 8 day lease time. ISC uses a default lease time of 10 minutes, with a max of 2 hours in their sample config included with 4.1.x.

  We have successfully used 1 day leases in the past, though I know some larger ISPs use 5 day, 7 day or even longer lease times.

  I'm assuming that the main advantage to a short lease time is that hosts that join and leave a network give their leases up more rapidly (keeping IP pool usage as low as possible). The main advantage to longer lease times is reduced load on the DHCP server. If I have a relatively stable network (only known MACs are allowed) then it seems like a longer lease time (say 7-14 days) is more appropriate. And on a relatively stable cable or DSL network, anything between 5 and 7 days seems acceptable? Volatile networks (wifi hotspots?) would probably benefit from a 1 hour or shorter lease time.

  Does it sound like I am in the right ballpark with these figures?


  -------- Original Message  --------
  Subject: Re: Watching performance on a DHCP Server
  From: Barr Hibbs <rbhibbs at>
  To: dhcp-users at
  Date: Sunday, February 10, 2008 4:35:37 PM

this experience is with a derivative of version 2 of the
server, but as the basic functionality has not changed
significantly for IPv4, it may be instructive....

at the time, our environment had about 12,000 clients split
roughly 55/45 between two servers...  each server was
connected by two links to each of approximately 120 remote
subnets, each link diversely routed to minimize disruption
due to network problems, but also delivering 2 copies of
every client message to the servers

we suffered a massive regional power failure that lasted
2-1/2 days before complete restoration...  our clients
received 7-day leases, largely grouped with their renewal
times between 8 am and 6 pm, so in a 2-1/2 day outage, we
could expect renewal requests to come from about half of our
clients, and certainly init-reboot requests to come from
all...  so, that is roughly 18,000 requests to be serviced
as power is restored....
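
The estimate above can be written out as a few lines of Python. This is only a sketch of the arithmetic in this message; the variable names are made up for illustration, and the doubling at the end reflects the two diversely routed links that each delivered a copy of every client message.

```python
# Worked arithmetic for the post-outage request estimate.
clients = 12_000
renewing_fraction = 0.5        # "about half" had renewals fall due during the outage

renewals = int(clients * renewing_fraction)
init_reboots = clients         # every client restarts once power returns
total_requests = renewals + init_reboots
print(total_requests)          # 18000

# Each client message reached the servers twice (two links per subnet).
copies_at_servers = total_requests * 2
print(copies_at_servers)       # 36000
```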

of course, the power restoral didn't occur all at once, but
was somewhat randomly distributed over a period of roughly
32 hours

entirely by coincidence, we had instrumented the server to
capture detailed message arrival rates and response times,
expecting a normal, boring weekend...  but then the power
failed, and...  we got lots more data than we expected!

the real-time clock on our computers was capable of only 1
millisecond resolution, so I must extrapolate....  our
servers survived a nearly CONTINUOUS load of more than 1,000
requests per second for 32 hours...

of course, your mileage may vary, but by choosing an
appropriate lease lifetime, you will probably see similar or
better performance.

--Barr Hibbs

  -----Original Message-----
From: dhcp-users-bounce at [mailto:dhcp-users-bounce at]On Behalf Of David W. Hankins
Sent: Friday, February 08, 2008 08:55
To: dhcp-users at
Subject: Re: Watching performance on a DHCP Server

On Thu, Feb 07, 2008 at 06:07:51PM -0600, Blake Hudson wrote:
    By default in my distribution the leases file is stored in
    /var/lib/dhcpd/dhcpd.leases. This happens to be on a RAID1 array with
    15k scsi disks and iostat shows the array as being maxed out once it
    reaches ~ 300 I/O's per second. DHCP logging is done asynchronously to
    the same array (which normally experiences ~ 50 I/O ops). With CPU and
    memory barely breaking a sweat, this leads me to believe that the
    limitation is with the disks (lots of tiny writes).

    I could move the leases file to a different array, or to tmpfs, but
    before I do I just want to know if these results are typical and that I
    have interpreted the test data correctly and made the correct
    determination as to the bottleneck.

those results are typical for that kind of hardware, and you have
interpreted the test data correctly: fsync() is the biggest bottleneck.

in 4.1.0a1, you will find a feature, however, which was provided to
us in a patch by Christof Chen.  it permits the server to queue
multiple ACKs behind a single fsync(); default 28 (576 byte DHCP
packets filling default socket buffer send sizes).  the burst of acks
are sent presently if the sockets go dry, and shortly will be backed
up with a sub-second timeout.

it has some bugs we're working on, particularly with failover, but
we'll address those in alpha.

you may find that it provides some form of multiplicative benefit to
your performance stats, since fsync() is the bottleneck, and now there
are 28 acks per fsync max.
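
The idea of queuing several ACKs behind one fsync() can be sketched in Python as below. This is not the ISC dhcpd code; the class name, method names and record format are hypothetical, and the sub-second timeout is left out. It only shows why the number of fsync() calls drops by up to the batch size (28 here, matching the default mentioned above).

```python
import os
import tempfile


class BatchedLeaseLog:
    """Hypothetical sketch: append lease records, fsync once per batch,
    and only then release the ACKs that were waiting on that fsync."""

    def __init__(self, path, max_batch=28):
        self._fh = open(path, "ab")
        self.max_batch = max_batch
        self._pending = []       # ACKs held back until the next fsync
        self.fsync_calls = 0

    def commit(self, lease_record: bytes, ack):
        """Queue one lease write; return the ACKs that are now safe to send."""
        self._fh.write(lease_record)
        self._pending.append(ack)
        if len(self._pending) >= self.max_batch:
            return self.flush()
        return []

    def flush(self):
        """Force pending records to disk and release their ACKs."""
        if not self._pending:
            return []
        self._fh.flush()
        os.fsync(self._fh.fileno())
        self.fsync_calls += 1
        ready, self._pending = self._pending, []
        return ready

    def close(self):
        self.flush()
        self._fh.close()


with tempfile.TemporaryDirectory() as tmp:
    log = BatchedLeaseLog(os.path.join(tmp, "dhcpd.leases"))
    sent = []
    for i in range(100):
        sent += log.commit(f"lease {i}\n".encode(), ack=i)
    sent += log.flush()          # input went dry: release the partial batch
    log.close()

print(len(sent), log.fsync_calls)  # 100 4
```

In this toy run, 100 lease updates cost 4 fsync() calls instead of 100. No ACK is released before its record has been synced, which is the property the batching has to preserve.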

so if you are only pushing 50 requests/s currently, you may live
comfortably in a 250 request/s buffer for some months until the
4.1.x code is stable?

    Also, I would appreciate any anecdotal evidence with regards to how many
    requests are typical in a large network under normal (or abnormal)
    conditions. If 10,000 users all of a sudden came online, how many
    requests would they really generate per second?

there have been a few folks who suffered mass power outages, i don't
know what search query to use, but you can find them on the old
dhcp-server mailing list.  they did not report problems, rather the
surprise at the lack of problem.

Ash bugud-gul durbatuluk agh burzum-ishi krimpatul.
Why settle for the lesser evil?
David W. Hankins	"If you don't do it right the first time,
Software Engineer		     you'll just have to do it again."
Internet Systems Consortium, Inc.		-- Jack T. Hankins
