<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=UTF-8" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

-------- Original Message  --------<br>

Subject: Re: Watching performance on a DHCP Server<br>

From: David W. Hankins <a class="moz-txt-link-rfc2396E" href="mailto:David_Hankins@isc.org">&lt;David_Hankins@isc.org&gt;</a><br>

To: <a class="moz-txt-link-abbreviated" href="mailto:dhcp-users@isc.org">dhcp-users@isc.org</a><br>

Date: Friday, February 08, 2008 10:55:14 AM<br>

<blockquote cite="mid:20080208165514.GB26411@isc.org" type="cite">

  <pre wrap="">On Thu, Feb 07, 2008 at 06:07:51PM -0600, Blake Hudson wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">By default in my distribution the leases file is stored in 

/var/lib/dhcpd/dhcpd.leases. This happens to be on a RAID1 array with 

15k scsi disks and iostat shows the array as being maxed out once it 

reaches ~ 300 I/O's per second. DHCP logging is done asynchronously to 

the same array (which normally experiences ~ 50 I/O ops). With CPU and 

memory barely breaking a sweat, this leads me to believe that the 

limitation is with the disks (lots of tiny writes).

I could move the leases file to a different array, or to tmpfs, but 

before I do I just want to know if these results are typical and that I 

have interpreted the test data correctly and made the correct 

determination as to the bottleneck.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

those results are typical for that kind of hardware, and you have

interpreted the test data correctly: fsync() is the biggest

bottleneck.

in 4.1.0a1, you will find a feature, however, which was provided to

us in a patch by Christof Chen.  it permits the server to queue

multiple ACKs behind a single fsync(); default 28 (576 byte DHCP

packets filling default socket buffer send sizes).  the burst of acks

are sent presently if the sockets go dry, and shortly will be backed

up with a sub-second timeout.

it has some bugs we're working on, particularly with failover, but

we'll address those in alpha.

you may find that it provides some form of multiplicative benefit to

your performance stats, since fsync() is the bottleneck, and now there

are 28 acks per fsync max.

so if you are only pushing 50 requests/s currently, you may live

comfortably in a 250 request/s buffer for some months until the

4.1.x code is stable?

  </pre>

  <blockquote type="cite">

    <pre wrap="">Also, I would appreciate any anecdotal evidence with regards to how many 

requests are typical in a large network under normal (or abnormal) 

conditions. If 10,000 users all of a sudden came online, how many 

requests would they really generate per second?

    </pre>

  </blockquote>

  <pre wrap=""><!---->

there have been a few folks who suffered mass power outages, i don't

know what search query to use, but you can find them on the old

dhcp-server mailing list.  they did not report problems, rather the

surprise at the lack of problem.

  </pre>

</blockquote>

Thanks for the follow-up, David. I was actually just typing a response
to your 2006 post entitled "Synthetic benchmarks", which confirmed that
these results are typical and led me to take another look at my disk
subsystem. I was able to increase speed dramatically by enabling the
write-back cache on this array. <br>

<br>

As I mentioned, these are beefy servers with redundant CPUs and PSUs,
battery-backed RAID, and redundant uninterruptible power supplies.
They have not experienced downtime of any kind since being put in
place 1 to 2 years ago. Perhaps I was just a bit paranoid in using
write-through. But perhaps so is DHCP, with its frequent fsync() calls.<br>

<br>

Half of my hosts are manually assigned within the dhcpd config file,
which should speed up the process greatly, though I am going to be
adding dynamically assigned leases in the near future. dhcperf has
shown me that even if I am hit with a deluge of requests, I
should be able to successfully answer several hundred each minute,
leading to a relatively short period of convergence.<br>

<br>

Perhaps the folks I referenced who had problems (1-3 days of
downtime) really just need a clue about how to properly configure
their servers.<br>

<br>

I'm glad that ISC is continuing development on DHCP. It seems that
major gains in performance can be achieved relatively easily by
either combining writes or writing asynchronously. I was a bit
surprised, though, that the leases table is not kept (and managed) in
memory, and that dhcpd, which currently relies so heavily on disk,
does not take advantage of any file system or disk
performance-enhancing features. I look forward to upgrading to 4.1,
though I will likely wait until it has been integrated into my
distribution for further testing.<br>
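<br>
To make the "combining writes" idea concrete, here is a minimal Python
sketch of queuing several ACKs behind a single fsync(). It is only an
illustration and not the 4.1.0a1 code: the class name BatchedLeaseLog,
the simulate() helper, the record format, and the temporary leases path
are all hypothetical. The batch size of 28 is the default quoted above.<br>
<pre wrap="">
```python
import os
import tempfile

ACKS_PER_FSYNC = 28  # batch size taken from the default quoted above


class BatchedLeaseLog:
    """Append lease records, calling fsync() once per batch (hypothetical helper)."""

    def __init__(self, path, batch_size=ACKS_PER_FSYNC):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.batch_size = batch_size
        self.pending = []     # ACKs whose lease records are not yet synced
        self.fsync_calls = 0  # counter kept only for the demonstration

    def commit(self, lease_record, ack):
        """Write one lease record; return the ACKs that are now safe to send."""
        os.write(self.fd, lease_record.encode() + b"\n")
        self.pending.append(ack)
        if len(self.pending) == self.batch_size:
            return self.flush()
        return []

    def flush(self):
        """Force queued writes to disk and release every waiting ACK."""
        if not self.pending:
            return []
        os.fsync(self.fd)
        self.fsync_calls += 1
        released, self.pending = self.pending, []
        return released

    def close(self):
        os.close(self.fd)


def simulate(requests, batch_size=ACKS_PER_FSYNC):
    """Return (acks_sent, fsync_calls) for a burst of requests."""
    with tempfile.TemporaryDirectory() as tmp:
        log = BatchedLeaseLog(os.path.join(tmp, "dhcpd.leases"), batch_size)
        sent = []
        for i in range(requests):
            sent.extend(log.commit(f"lease 10.0.0.{i % 250 + 1}", i))
        sent.extend(log.flush())  # input ran dry: release the partial batch
        log.close()
        return len(sent), log.fsync_calls
```
</pre>
Under these assumptions, simulate(100) answers all 100 requests with 4
fsync() calls, while simulate(100, 1), which syncs after every lease,
needs 100. No ACK is released before its lease record has been synced,
so the durability guarantee is the same; only the number of syncs
changes.<br>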

<br>

--Blake<br>

<br>

</body>

</html>