selecttest tool

JINMEI Tatuya / 神明達哉 Jinmei_Tatuya at isc.org
Tue Aug 12 03:10:09 UTC 2008


At Mon, 11 Aug 2008 11:13:13 -0500,
Walter Gould <gouldwp at auburn.edu> wrote:

> Here's the latest in my DNS horror story...

[snip]

> I started named and watched 'rndc status'. After a few minutes this was 
> the result:
> # rndc status
> version: 9.5.0-P2 ()
> number of zones: 365
> debug level: 0
> xfers running: 0
> xfers deferred: 0
> soa queries in progress: 0
> query logging is ON
> recursive clients: 4980/9900/10000
> tcp clients: 0/100
> server is up and running
> 
> When the recursive clients reached this level (or shortly before) 
> queries started timing out...
> 
> /var/log/messages showed the all too familiar too many open sockets error:
> 
> Aug 11 10:34:17 dnsnew named[24266]: error: socket: too many open file 
> descriptors
> Aug 11 10:34:31 dnsnew last message repeated 1876 times
> 
> My questions are -
> 1. Do you think I should increase the FDSETSIZE to 10,000 or some other 
> crazily high number?
> 2. Is that excessive?

You could try this, and it *might* help, but if the server is handling
such a high number of recursive clients regularly, I'm afraid it will
just trigger another scalability problem.

> 3. What other adverse effects might this cause on my server?

In general, allowing a large number of concurrent open sockets will
make the server busier, roughly in proportion to the number of
sockets.
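
To illustrate the point, here is a toy model only (it is not BIND's
actual socket code, and the function name is made up for this
example). With a select()-style interface, each pass tests every
descriptor up to the highest one in use, even when only a few of them
are ready:

```python
def scan_descriptors(max_fd, ready):
    """Toy model of one select()-style pass over descriptors 0..max_fd.

    `ready` is the set of descriptors that have pending events.
    Returns the ready descriptors found and the number of tests made.
    """
    checks = 0
    found = []
    for fd in range(max_fd + 1):
        checks += 1              # one FD_ISSET-like test per descriptor
        if fd in ready:
            found.append(fd)
    return found, checks


# Two ready sockets out of 4096 possible descriptors still cost 4096 tests.
found, checks = scan_descriptors(4095, {3, 4000})
print(found, checks)
```

The work per pass depends on the size of the descriptor range rather
than on how many sockets actually have data.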

> 4. Am I the only one having problems with a) ISC patched BIND packages 
> and b) Red Hat patched BIND rpms?

I don't know the answer to this question, but your operational
environment seems to be unusual in several respects:

- it's acting both as an authoritative and as a caching server
- as an authoritative server, it's managing a pretty large number of
  zones (which may require resource-consuming operations such as zone
  transfers)
- as a caching server, it seems to be handling a high volume of
  queries (several thousand concurrent clients)
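
If it helps to track the recursive-client load over time, the
following is a minimal sketch that pulls the figures out of the
'rndc status' text quoted above. The function name and dictionary keys
are hypothetical, and I'm assuming the three numbers are
current / soft limit / hard limit:

```python
import re


def parse_recursive_clients(status_text):
    """Extract the 'recursive clients: a/b/c' figures from rndc status text.

    Assumption: the fields are current / soft limit / hard limit.
    Returns None when the line is not present.
    """
    m = re.search(r"^recursive clients:[ \t]*(\d+)/(\d+)/(\d+)[ \t]*$",
                  status_text, re.MULTILINE)
    if m is None:
        return None
    current, soft, hard = (int(g) for g in m.groups())
    return {"current": current, "soft_limit": soft, "hard_limit": hard}


# Sample taken from the output quoted in this thread.
sample = (
    "xfers running: 0\n"
    "recursive clients: 4980/9900/10000\n"
    "tcp clients: 0/100\n"
)
print(parse_recursive_clients(sample))
```

Running this against periodic 'rndc status' output would show how
close the server is to its limits before queries start timing out.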

While we've worked hard on P2 to make it as scalable as possible while
keeping it as conservative as possible, this environment may just
exceed the ability of the conservative implementation.

I know operators don't like a radical solution, but I'd really like
you to give the beta version a try.  At least the next beta versions
(which will hopefully be released later this week or early next week)
should be much more stable than the currently available ones, and
should not be as "radical" as you might think.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.


More information about the bind-users mailing list