DNS server caching performance test results.

Matt Simerson mpsimerson at hostpro.com
Thu May 17 00:12:18 UTC 2001


DNS caching server load testing.

This testing is designed to determine how caching name servers perform under
a variety of network conditions. Some tests are run against a dns wall and
should directly reflect the ability of the cache to look up and answer
queries. Other tests are run against primed caches, reflecting a dns server's
ability to retrieve cached data. Still other tests are run against "real
world" name servers, forcing the caches to deal with latency, failures, and
timeouts.

Test 1: Raw lookup performance & raw cache performance.

This test requires the dns server to query a dns wall running on a separate
host. The dns wall is a simple program named "walldns" that generates
matching forward and reverse dns answers for blocks of IP space. The test
setup looks like this:

Role        CPU         RAM    Program         Used RAM  Disk             Limits    Log
dns wall    PIII 550    1GB    walldns 1.05    900k      RAID 5 LVD SCSI  -d250000  status only

dns cache   PIII 600    512MB  dnscache 1.05   290MB     RAID 5 LVD SCSI  -o 5000*  status only
                               BIND 8.2.3      8MB                        unlimit   none
                               BIND 9.1.1      12MB                       unlimit   none

dns client  2xPIII 850  1GB    dnsfilter-1.05  <5MB      RAID 5 LVD SCSI  -c 10     full
                                                                          -c 100
 
*MAXUDP compiled with value of 2000.

Configuring walldns is simplicity itself: install the software and start it
up. It listens on port 53 just like a standard dns server, but it answers
queries by making up a hostname or IP. For example, I send it a query for
216.122.0.1 and it makes up a hostname of "1.0.122.216.in-addr.arpa" and a
TTL of one week. Do a lookup on that hostname and of course it'll resolve to
that IP. It's very convenient and, as testing will show, it's very fast too.
:-)
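
To make the made-up answers concrete, here is a minimal Python sketch of the
name/address mapping described above. It is not walldns's code, and the
function names are hypothetical; it only reproduces the 216.122.0.1 example
using the standard ipaddress module.

```python
import ipaddress

SUFFIX = ".in-addr.arpa"


def wall_reverse_name(ip: str) -> str:
    """Return the in-addr.arpa hostname a dns wall would make up for an IPv4 address."""
    return ipaddress.IPv4Address(ip).reverse_pointer


def wall_forward_address(name: str) -> str:
    """Recover the IPv4 address encoded in a made-up in-addr.arpa hostname."""
    trimmed = name.rstrip(".").lower()
    if not trimmed.endswith(SUFFIX):
        raise ValueError("not an in-addr.arpa name: " + name)
    # The octets appear in reverse order in front of the suffix.
    octets = trimmed[: -len(SUFFIX)].split(".")
    return str(ipaddress.IPv4Address(".".join(reversed(octets))))


if __name__ == "__main__":
    name = wall_reverse_name("216.122.0.1")
    print(name, "->", wall_forward_address(name))
```

The forward and reverse functions are inverses of each other, which is why
every address in the test list can be checked for a correct round trip.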

Configuring the caches was pretty basic. For BIND 8 and BIND 9, run unlimit
and add a couple of lines to the options section (forward only; forwarders {
216.122.x.x; };) to convince it to query only the dns wall. Dnscache was
also easy to configure: (echo "216.122.x.x" >
~dnscache/root/servers/122.216.in-addr.arpa.). That tells dnscache to
forward all requests for the 216.122 network off to the dns wall.

The client configuration turned out to be quite tricky. I've looked at quite
a few different test programs for dns. Netperf3 is supposed to be a good one,
but I've had no luck getting it working on FreeBSD and I'm not patient
enough to keep fiddling with it. I've also played a bit with the Net::DNS
perl modules and the author-supplied mresolv and mresolv2, but none of the
perl "dns testers" could generate a meaningful amount of load. I was left
back where I started, with dnsfilter.

Dnsfilter is a C program supplied with djbdns that takes a list of IPs and
does lookups on them. It writes the output back to STDOUT, and I piped all
the output to files to verify the accuracy of the results. After much
testing of dnsfilter and its limitations, I deduced that setting its
number of parallel lookups higher than 100 effectively chokes it after
around 12,000 very quick queries. Keeping the number low prevented that. I
ran most tests at the default value of 10 parallel lookups unless otherwise
noted. The only reason to use a -c value higher than 100 is when querying
real world data, where you need lots more parallel lookups because you'll
have a high number of time-outs or other failures.
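
The idea behind the -c setting, a cap on how many lookups are in flight at
once, can be sketched in Python. This is not how dnsfilter is implemented;
it is an illustration built on a thread pool, and the names lookup_all and
reverse_lookup are hypothetical. The resolver is passed in as a parameter so
the sketch can be tried without touching a real name server.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def reverse_lookup(ip):
    """Resolve one IP through the system resolver; return None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def lookup_all(ips, parallel=10, resolver=reverse_lookup):
    """Look up every IP with at most `parallel` lookups in flight.

    Results come back in input order as (ip, hostname) pairs.
    """
    ips = list(ips)
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(zip(ips, pool.map(resolver, ips)))


if __name__ == "__main__":
    # A stand-in resolver so the example runs without any network access.
    fake = lambda ip: "host-" + ip
    for ip, name in lookup_all(["216.122.0.1", "216.122.0.2"], resolver=fake):
        print(ip, name)
```

With slow or failing upstream servers, each worker spends most of its time
waiting, which is why a larger cap only pays off against real world data.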

What follows is the output of my first batch of tests. I ran the following
command 3 times for each dns cache: "time dnsfilter < iplist.wall >
out[1-3]". The first run reflects the cache's need to fetch the results from
the dns wall and return them to the client. The two subsequent runs reflect
the cache's ability to serve results from its cache. The file iplist.wall
simply contains 65,536 IP addresses representing the class B network
216.122.0.0.

Name Server               time(s)   qps 
dnscache - 290MB RAM      63        1040 
                          29        2260 
                          29        2260 
      
BIND 8 - 8MB RAM          21        3120 
                          40        1638 
                          39        1680 
      
BIND 9 - 12MB RAM         81        809 
                          29        2260 
                          29        2260 
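
The qps column is just the 65,536 queries divided by the elapsed time. A
small Python check of that arithmetic follows; the function name is
hypothetical, and the times are copied from the table above. The computed
values agree with the table to within about one query per second, the
difference being rounding.

```python
TOTAL_QUERIES = 65536  # number of addresses in iplist.wall


def queries_per_second(elapsed_seconds, total=TOTAL_QUERIES):
    """Average query rate for a run that took elapsed_seconds."""
    return total / elapsed_seconds


# Elapsed times in seconds for the three runs of each server.
runs = {
    "dnscache": [63, 29, 29],
    "BIND 8": [21, 40, 39],
    "BIND 9": [81, 29, 29],
}

if __name__ == "__main__":
    for server, times in runs.items():
        rates = [round(queries_per_second(t)) for t in times]
        print(server, rates)
```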

Memory usage isn't meaningful for dnscache because its cache size is a
startup parameter. You tell it how big a cache you want to maintain, and once
it's full it throws out the oldest entries. I consider that to be better than
allowing your cache to grow until it exhausts all your physical RAM and swap
(which I do later :-)). Between the BINDs, version 8 starts out with 2MB and
version 9 starts out with 4MB. After the 65,536 queries, v9 has grown by 8MB
where v8 has only increased by 6MB. Apparently v8 is more memory efficient in
how it stores cached data.

I went back and retested these runs a couple of times because the results
just didn't seem right. In every case, all three dns caches resolved all
65,536 IP addresses correctly. What I found most odd was that BIND 8 served
the results faster on the first run, when nothing was cached yet, than it did
from its own cache. :-| I found that little revelation quite surprising. What
it suggests is that BIND 9 and dnscache both have a faster cache
storage/lookup algorithm. v8 was the fastest at resolving uncached queries
and v9 was the slowest.

The next thing I did was to spread out the client load. I did this by
splitting the file "iplist.wall" into three roughly equal-sized chunks and
copying them to three servers with dnsfilter installed (hardware specs the
same for all dns client servers). So, each dns client was responsible for
looking up a little over twenty thousand IP addresses. I then executed the
following command on all three servers at the same time: "time dnsfilter <
/usr/iplist.wall.seg > /usr/out[1-3]". Client time is the combined time
spent by all three clients looking up data. Time is the elapsed time taken to
run the test. Both are in seconds. Here are the results:

Name Server             client-time (s)   time (s)   qps
dnscache - 290MB RAM    200               67         981
                        93                31         2112
                        86                29         2286

BIND 8 - 8MB RAM        51                17         3855
                        114               38         1725
                        114               38         1725

BIND 9 - 12MB RAM       239               80         822
                        82                27         2397
                        81                27         2427
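
For anyone wanting to repeat the setup, the address list and the three-way
split can be sketched in Python. This is an assumption-laden illustration,
not the procedure actually used: the /16 notation is my rendering of the
class B network, the function names are hypothetical, and the split here is
into contiguous chunks, which may differ from how the original file was cut.

```python
import ipaddress


def build_iplist(network="216.122.0.0/16"):
    """Return every address in the network as a string, one per list entry."""
    return [str(ip) for ip in ipaddress.ip_network(network)]


def split_evenly(items, parts=3):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    ips = build_iplist()
    for number, chunk in enumerate(split_evenly(ips), start=1):
        print("segment", number, "holds", len(chunk), "addresses")
```

Since 65,536 is not divisible by three, one chunk ends up one address longer
than the other two.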

I've run the tests a couple more times and got similar results. I'm fairly
confident that I've reached the maximum abilities of each dns server on the
current hardware. I'm also quite confident that the testing is yielding
accurate results.

I'm getting ready to run another battery of tests against our production
servers, resolving the entire Class B. I believe the results of that testing
will also be quite valuable, as they show how each dns server deals with
timeouts and lookup failures.

Matt 


