Watching performance on a DHCP Server

Tue Feb 12 05:16:49 UTC 2008

We use a variety of lease times:
- 3 days for our PPPoA/PPPoE/BWA CPE/cable modem CPE
- 3 days for our cable modems
- 2 weeks for our eMTAs
- one month for our IP TV STBs

Frank

  _____  

From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org] On Behalf
Of Blake Hudson
Sent: Monday, February 11, 2008 9:51 AM
To: dhcp-users at isc.org
Subject: Re: Watching performance on a DHCP Server

Thanks Barr, it is always interesting to hear relative practical
experiences. This is exactly the kind of problem I would like to
prepare/plan for. I've read that Microsoft defaults to an 8 day lease time.
ISC uses a default lease time of 10 minutes, with a max of 2 hours in their
sample config included with 4.1.x.

We have successfully used 1-day leases in the past, though I know some
larger ISPs use 5-day, 7-day, or even longer lease times.

I'm assuming that the main advantage of a short lease time is that hosts
that join and leave a network give up their leases more rapidly (keeping IP
pool usage as low as possible). The main advantage of longer lease times
is reduced load on the DHCP server. If I have a relatively stable network
(only known MACs are allowed), then a longer lease time (say 7-14 days)
seems more appropriate. And on a relatively stable cable or DSL network,
anything between 5 and 7 days seems acceptable? Volatile networks (Wi-Fi
hotspots?) would probably benefit from a lease time of 1 hour or less.
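
To put rough numbers on the load side of this trade-off, here is a minimal
sketch. It assumes each client renews once at half the lease time (the
default T1 in RFC 2131) and that renewals are spread evenly over time. The
function name and the 10,000-client figure are illustrative, not
measurements from any real network.

```python
def renewals_per_second(clients: int, lease_seconds: float) -> float:
    """Steady-state renewal rate, assuming one renewal per client at T1 = lease/2."""
    return clients / (lease_seconds / 2)


HOUR = 3600
DAY = 24 * HOUR

# Hypothetical population of 10,000 clients at several lease times.
for label, lease in [("1 hour", HOUR), ("1 day", DAY),
                     ("7 days", 7 * DAY), ("14 days", 14 * DAY)]:
    rate = renewals_per_second(10_000, lease)
    print(f"{label:>8}: {rate:.3f} renewals/s")
```

Under these assumptions, 10,000 clients on 1-hour leases come to roughly 5.6
renewals per second, while 7-day leases come to about 0.03 per second. The
model ignores new clients, retries, and renewals clustered in business
hours, so treat it as a floor for planning rather than a prediction.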

Does it sound like I am in the right ballpark with these figures?

-Blake

-------- Original Message  --------
Subject: Re: Watching performance on a DHCP Server
From: Barr Hibbs <rbhibbs at pacbell.net>
To: dhcp-users at isc.org
Date: Sunday, February 10, 2008 4:35:37 PM

this experience is with a derivative of version 2 of the server, but as
the basic functionality has not changed significantly for IPv4, it may
be instructive....

at the time, our environment had about 12,000 clients split roughly
55/45 between two servers...  each server was connected by two links to
each of approximately 120 remote subnets, each link diversely routed to
minimize disruption due to network problems, but also delivering 2
copies of every client message to the servers

we suffered a massive regional power failure that lasted 2-1/2 days
before complete restoration...  our clients received 7-day leases,
largely grouped with their renewal times between 8 am and 6 pm, so in a
2-1/2 day outage, we could expect renewal requests to come from about
half of our clients, and certainly init-reboot requests to come from
all...  so, that is roughly 18,000 requests to be serviced as power is
restored....
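
The 18,000 figure above can be reproduced with a few lines of arithmetic.
This is only a sketch of the estimate as described: the constant names are
made up for illustration, and the doubled packet count assumes that every
request really does arrive twice over the two links, which the message
implies but does not quantify. Client retries are not counted.

```python
CLIENTS = 12_000
RENEWING_FRACTION = 0.5   # "about half of our clients" were due to renew
COPIES_PER_MESSAGE = 2    # two diversely routed links per subnet

renewals = int(CLIENTS * RENEWING_FRACTION)   # clients whose renewal time passed
init_reboots = CLIENTS                        # every client restarts after the outage
requests = renewals + init_reboots

# Assumption: each request is seen twice by the servers, once per link.
packets_seen = requests * COPIES_PER_MESSAGE

print(requests, packets_seen)
```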

of course, the power restoral didn't occur all at once, but was somewhat
randomly distributed over a period of roughly 32 hours

entirely by coincidence, we had instrumented the server to capture
detailed message arrival rates and response times, expecting a normal,
boring weekend...  but then the power failed, and...  we got lots more
data than we expected!

the real-time clock on our computers was capable of only 1 millisecond
resolution, so I must extrapolate....  our servers survived a nearly
CONTINUOUS load of more than 1,000 requests per second for 32 hours...

of course, your mileage may vary, but by choosing an appropriate lease
lifetime, you will probably see similar or better performance.

--Barr Hibbs

-----Original Message-----
From: dhcp-users-bounce at isc.org [mailto:dhcp-users-bounce at isc.org] On Behalf
Of David W. Hankins
Sent: Friday, February 08, 2008 08:55
To: dhcp-users at isc.org
Subject: Re: Watching performance on a DHCP Server

On Thu, Feb 07, 2008 at 06:07:51PM -0600, Blake Hudson wrote:
> By default in my distribution the leases file is stored in
> /var/lib/dhcpd/dhcpd.leases. This happens to be on a RAID1 array with
> 15k scsi disks and iostat shows the array as being maxed out once it
> reaches ~ 300 I/O's per second. DHCP logging is done asynchronously to
> the same array (which normally experiences ~ 50 I/O ops). With CPU and
> memory barely breaking a sweat, this leads me to believe that the
> limitation is with the disks (lots of tiny writes).
>
> I could move the leases file to a different array, or to tmpfs, but
> before I do I just want to know if these results are typical and that I
> have interpreted the test data correctly and made the correct
> determination as to the bottleneck.

those results are typical for that kind of hardware, and you have
interpreted the test data correctly: fsync() is the biggest
bottleneck.

in 4.1.0a1, you will find a feature, however, which was provided to
us in a patch by Christof Chen.  it permits the server to queue
multiple ACKs behind a single fsync(); default 28 (576 byte DHCP
packets filling default socket buffer send sizes).  the burst of acks
are sent presently if the sockets go dry, and shortly will be backed
up with a sub-second timeout.
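
The idea of queuing several ACKs behind one fsync() can be sketched as
follows. This is not the ISC dhcpd code or the patch mentioned above; the
class and method names are hypothetical, the "sent" list stands in for
putting packets on the wire, and the sub-second timeout is left out. It only
shows the ordering: write each lease, hold its ACK, sync once, then release
the whole batch.

```python
import os
import tempfile


class BatchedLeaseLog:
    """Append lease records and fsync once per batch of pending ACKs."""

    def __init__(self, path, max_batch=28):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.max_batch = max_batch
        self.pending = []     # ACKs held back until their leases are on disk
        self.sent = []        # stand-in for ACKs actually transmitted
        self.fsync_calls = 0

    def commit(self, lease_record, ack):
        """Write one lease record and queue its ACK behind the next fsync."""
        os.write(self.fd, lease_record.encode() + b"\n")
        self.pending.append(ack)
        if len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        """Call when the batch is full or the input socket has gone dry."""
        if not self.pending:
            return
        os.fsync(self.fd)     # one sync covers every queued lease
        self.fsync_calls += 1
        self.sent.extend(self.pending)
        self.pending.clear()

    def close(self):
        self.flush()
        os.close(self.fd)


def demo(n_requests=100):
    """Commit n_requests leases to a scratch file; return (fsyncs, ACKs sent)."""
    with tempfile.TemporaryDirectory() as scratch:
        log = BatchedLeaseLog(os.path.join(scratch, "dhcpd.leases"))
        for i in range(n_requests):
            log.commit(f"lease 10.0.0.{i}", f"ACK {i}")
        log.close()
        return log.fsync_calls, len(log.sent)
```

No ACK is released before the fsync() that covers its lease record, so the
durability guarantee is unchanged; only the number of syncs drops.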

it has some bugs we're working on, particularly with failover, but
we'll address those in alpha.

you may find that it provides some form of multiplicative benefit to
your performance stats, since fsync() is the bottleneck, and now there
are 28 acks per fsync max.
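
As a rough illustration of that multiplicative effect, the sketch below
computes a best-case ceiling. The names are hypothetical, and the 300
fsync()/s figure is an assumption borrowed from the ~300 I/O's per second
reported earlier in the thread, treating each I/O as one fsync(); real
throughput depends on how full each batch actually is.

```python
DHCP_PACKET_BYTES = 576   # packet size quoted above
ACKS_PER_FSYNC = 28       # default batch size quoted above

# Bytes of ACKs released per batch at the default settings.
batch_bytes = DHCP_PACKET_BYTES * ACKS_PER_FSYNC


def ack_ceiling(fsyncs_per_second, acks_per_fsync=ACKS_PER_FSYNC):
    """Best-case ACK rate when every fsync() releases a full batch."""
    return fsyncs_per_second * acks_per_fsync


print(batch_bytes)
print(ack_ceiling(300, acks_per_fsync=1), ack_ceiling(300))
```

With full batches, 300 syncs per second would cover 8,400 ACKs per second
instead of 300, and each batch is 16,128 bytes of packets.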

so if you are only pushing 50 requests/s currently, you may live
comfortably in a 250 request/s buffer for some months until the
4.1.x code is stable?

> Also, I would appreciate any anecdotal evidence with regards to how many
> requests are typical in a large network under normal (or abnormal)
> conditions. If 10,000 users all of a sudden came online, how many
> requests would they really generate per second?

there have been a few folks who suffered mass power outages, i don't
know what search query to use, but you can find them on the old
dhcp-server mailing list.  they did not report problems, rather the
surprise at the lack of problem.

--
Ash bugud-gul durbatuluk agh burzum-ishi krimpatul.
Why settle for the lesser evil?
https://secure.isc.org/store/t-shirt/
--
David W. Hankins                   "If you don't do it right the first time,
Software Engineer                   you'll just have to do it again."
Internet Systems Consortium, Inc.               -- Jack T. Hankins
