<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=UTF-8" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

-------- Original Message  --------<br>

Subject: Re: Watching performance on a DHCP Server<br>

From: <a class="moz-txt-link-abbreviated" href="mailto:sthaug@nethelp.no">sthaug@nethelp.no</a><br>

To: <a class="moz-txt-link-abbreviated" href="mailto:dhcp-users@isc.org">dhcp-users@isc.org</a><br>

Date: Tuesday, February 12, 2008 3:20:50 PM<br>

<blockquote cite="mid:20080212.222050.41640557.sthaug@nethelp.no"

 type="cite">

  <blockquote type="cite">

    <pre wrap="">The numbers I'm quoting are performance figures from the dhcperf testing 

tool. If your 100k users were to be down 24 hours (due to regional power 

failure, massive server/network failure), how would your DHCP server 

cope with 100K requests at once? This thread is not about 'normal' 

conditions, it is about the extreme cases that administrators (at least 

good ones) are expected to design for and accommodate ahead of time.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

I'm planning for the extreme cases by having a redundant server (RAID,

redundant power etc), and soon DHCP failover. I'm *not* planning for

our DHCP infrastructure to cope with 100K requests at once, since I

find such a scenario highly unlikely (even considering the failures

that you have mentioned above).

Steinar Haug, Nethelp consulting, <a class="moz-txt-link-abbreviated" href="mailto:sthaug@nethelp.no">sthaug@nethelp.no</a>

  </pre>

</blockquote>

<br>

I thought such a scenario was unlikely too, but I am choosing to plan for it

since I've seen two ISPs in the last month who have let 100% of their

leases expire. Their user bases are not 100k, but their

servers/infrastructure couldn't cope with the requests and it resulted

in hours or days of unnecessary downtime for their users. In these

instances the problems were likely 'administrator error', with either

poor planning or configuration.<br>

<br>

I'm glad that we have never experienced an issue of this magnitude, and

while administrator error can mostly be reduced through redundancy,

planning, etc., there are some things that lie outside of our control.

If something should happen, I'd like to be prepared.<br>

<br>

I'm preparing by testing our equipment and configuration in order to

confidently state what the server's limits are and to be able to

provide information which supports my claim. I felt the performance of

DHCPD 3.x was too slow. By removing the primary bottleneck in DHCPD

(high numbers of fsyncs) I have been able to increase the server's

capacity ~3x on 4-way transactions and ~100x on 2-way transactions

while putting less load on a server that provides other needed

services. I feel confident that if there were a large scale outage, our

DHCP server would not be overwhelmed, DHCP convergence would not be a

limiting factor, and customer downtime would be minimized by

these efforts.<br>
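<br>
To illustrate why the number of fsyncs matters, here is a minimal Python
sketch. It is not the actual DHCPD change: the function names, the
sync_every parameter, and the simplified lease records are all
hypothetical. It only compares syncing after every record against
syncing once per batch.<br>
<pre wrap="">
```python
import os
import tempfile
import time


def write_leases(path, records, sync_every):
    """Append records to path, calling fsync after every sync_every records.

    Returns the number of fsync calls made.
    """
    fsync_calls = 0
    with open(path, "ab") as f:
        for i, record in enumerate(records, start=1):
            f.write(record)
            if i % sync_every == 0:
                f.flush()
                os.fsync(f.fileno())
                fsync_calls += 1
        # Sync any trailing records that did not fill a full batch.
        if len(records) % sync_every != 0:
            f.flush()
            os.fsync(f.fileno())
            fsync_calls += 1
    return fsync_calls


def timed_run(records, sync_every):
    """Write records to a fresh temp file; return (fsync_calls, seconds)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        calls = write_leases(path, records, sync_every)
        return calls, time.perf_counter() - start
    finally:
        os.remove(path)


if __name__ == "__main__":
    # Hypothetical lease entries; the real dhcpd.leases format is richer.
    records = [b"lease 10.0.0.%d;\n" % (i % 250) for i in range(1000)]
    for batch in (1, 100):
        calls, secs = timed_run(records, batch)
        print("sync_every=%d fsyncs=%d seconds=%.3f" % (batch, calls, secs))
```
</pre>
The trade-off is durability: with batching, a crash can lose up to
sync_every records that were written but not yet synced, so how far to
relax syncing depends on how much lease loss is acceptable.<br>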

<br>

-Blake<br>

</body>

</html>