bind-9.3.2 / CPU issue.

mbrandeis at 013barak.net.il mbrandeis at 013barak.net.il
Mon Aug 21 08:59:57 UTC 2006


Hello all, 

Just wanted to inform you we've had the same problem described here for half a year.

When we replaced the daemon with djbdns (dnscache), the problem went away permanently.

Right now I've had to downgrade to BIND 8.4.7 to avoid this problem.

After digging into the texts, I found that the working assumption is that this problem is directly related to the cache size.

Once the cache grows over size X, most of the CPU goes to *time* / *local-time* calls (use strace/truss to check that).

The probable reason is that the server performs a huge number of timestamp checks just to determine whether entries in the cache have expired.

So, after time Z the cache grows over size X, and the load of keeping it up to date and comparing each TTL against the current time costs too many resources.

I don't know how true that is, but it's the best theory I've heard so far.
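
One way to test the theory with strace, as suggested above: attach to the running daemon with `strace -c -f -p <pid>`, stop it after a minute or so, and look at the per-syscall summary it prints. The Python sketch below only adds up the "% time" column for time-related calls in such a summary. The function name and the set of syscall names are assumptions made for illustration, and it assumes the time calls show up as real syscalls on your platform.

```python
# Sketch: sum the "% time" share of time-related syscalls in `strace -c` output.
# TIME_SYSCALLS is an assumed list; adjust it to what your platform reports.
TIME_SYSCALLS = {"time", "gettimeofday", "clock_gettime"}


def time_syscall_share(summary: str) -> float:
    """Return the summed '% time' of time-related syscalls in the summary text."""
    share = 0.0
    for line in summary.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            percent = float(parts[0])  # first column is "% time"
        except ValueError:
            continue  # header and separator lines are skipped here
        # The syscall name is always the last column; the "errors"
        # column before it may be empty, so do not index from the left.
        if parts[-1] in TIME_SYSCALLS:
            share += percent
    return share
```

If the result stays small while named is spinning, the time is going somewhere other than these calls and the theory does not hold for that server.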

As I said, I just downgraded to BIND 8.4.7. I'll be thrilled to hear of a different solution (one that doesn't require reducing the cache size).

Best Regards,
Meron Brandeis
System Unix
Barak ITC, Ltd.

-----Original Message-----
From: bind-users-bounce at isc.org [mailto:bind-users-bounce at isc.org]On
Behalf Of Pawel Rogocz
Sent: Sunday, August 20, 2006 1:48 AM
Cc: bind-users at isc.org
Subject: Re: bind-9.3.2 / CPU issue.


This issue has been troubling us for almost two years now, since we
deployed BIND9.

We have a bunch of named instances running behind a load balancer, each
getting on average 1k DNS queries per second.

We currently run with watchdogs which kill named if it starts using 100%
CPU.
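
A watchdog of this kind could look roughly like the Python sketch below. It is not the script actually in use here: the function names, the 95% threshold, the sampling interval and the window length are all assumptions made for illustration. It reads utime and stime from /proc/<pid>/stat and kills the process only after CPU use has stayed high for a whole window of samples.

```python
import os
import signal
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split after the last ')' before splitting on whitespace.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(before: int, after: int, interval_s: float, ticks_per_s: int) -> float:
    """CPU use over the interval, as a percentage of one CPU."""
    return 100.0 * (after - before) / (ticks_per_s * interval_s)


def should_kill(samples, threshold: float = 95.0) -> bool:
    """True only if there are samples and every one reaches the threshold."""
    return len(samples) > 0 and all(s >= threshold for s in samples)


def watch(pid: int, interval_s: float = 5.0, window: int = 6) -> None:
    """Poll the process and send SIGKILL after `window` consecutive hot samples."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    samples = []
    with open(f"/proc/{pid}/stat") as f:
        prev = cpu_ticks(f.read())
    while True:
        time.sleep(interval_s)
        with open(f"/proc/{pid}/stat") as f:
            cur = cpu_ticks(f.read())
        pct = cpu_percent(prev, cur, interval_s, ticks_per_s)
        samples = (samples + [pct])[-window:]  # keep only the newest samples
        prev = cur
        if len(samples) == window and should_kill(samples):
            os.kill(pid, signal.SIGKILL)
            return
```

Requiring a full window of hot samples avoids killing named on a short, legitimate burst. The counters in /proc/<pid>/stat cover all threads, so a multi-threaded named can report more than 100%.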

Just recently I noticed that when named enters this state, it starts
replying with erroneous data.

For example, cached data never gets its TTL decreased: www.sun.com
always has a TTL of 900. Also, queries of type ANY against authoritative
data intermittently fail with a SERVFAIL error.
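
The stuck-TTL symptom can be checked mechanically by asking the same server for the same cached name twice and comparing the TTLs. The Python sketch below assumes answer lines in the usual "name TTL class type rdata" layout (for example from `dig +noall +answer`). The function names and the two-second minimum gap are assumptions made for illustration.

```python
def parse_answer_ttl(answer_line: str) -> int:
    """Return the TTL from one answer line: name TTL class type rdata."""
    fields = answer_line.split()
    return int(fields[1])


def ttl_looks_stuck(first_ttl: int, second_ttl: int, elapsed_s: float,
                    min_elapsed_s: float = 2.0) -> bool:
    """Guess whether a cached record's TTL failed to count down.

    first_ttl and second_ttl come from two queries to the same caching
    server, taken elapsed_s seconds apart.
    """
    if elapsed_s < min_elapsed_s:
        return False  # too soon to expect a visible change
    if elapsed_s >= first_ttl:
        return False  # the record may have expired and been fetched again
    # A healthy cache reports a smaller TTL on the second query.
    return second_ttl >= first_ttl
```

For the case described above, two answers of 900 taken ten seconds apart give `ttl_looks_stuck(900, 900, 10)`, which returns True. A second answer of 890 returns False.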

We also see an increased number of Udp InErrors in /proc/net/snmp when
named enters this state.
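
To track that counter over time rather than by eye, something like the Python sketch below could be used. It relies on /proc/net/snmp listing each protocol as a pair of lines, a header line of field names followed by a line of values, both prefixed with the protocol name. The function names are assumptions made for illustration.

```python
def parse_snmp(text: str) -> dict:
    """Parse /proc/net/snmp text into {protocol: {field_name: value}}."""
    tables = {}
    pending = {}  # protocol -> field names, waiting for the value line
    for line in text.splitlines():
        if ":" not in line:
            continue
        proto, rest = line.split(":", 1)
        fields = rest.split()
        if proto in pending:
            # Second line of the pair: numeric values in header order.
            names = pending.pop(proto)
            tables[proto] = dict(zip(names, (int(v) for v in fields)))
        else:
            # First line of the pair: the field names.
            pending[proto] = fields
    return tables


def udp_in_errors_delta(before: str, after: str) -> int:
    """Growth of Udp InErrors between two snapshots of the file."""
    first = parse_snmp(before)["Udp"]["InErrors"]
    second = parse_snmp(after)["Udp"]["InErrors"]
    return second - first
```

Taking a snapshot of the file every few seconds and logging the delta next to named's CPU use would show whether the errors start before or after the daemon begins to spin.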

We have run with all sorts of Linux 2.2/2.4 kernels and the problem was
always there. We currently run 9.3.2 with internal malloc enabled.


Pawel


On Tue, Aug 15, 2006 at 02:53:43PM -0700, Kelsey Cummings wrote:
> FWIW, I've seen similar behavior on some of our recursive servers in
> specific roles.  The only thing that might be unusual about our config is
> that a very high portion of the requests are going to forwarded zones.
> 
> It's been a consistent problem for us through all versions of BIND 9 - we've
> had to use BIND 8 to keep them stable.  We suspected it could be a problem
> with our compiler/libraries but the problem consistently occurs regardless
> of what distribution or version we try to run.  (All Linux.)
> 
> It seems to be load related - only affects two of our internal recursors
> that do ~1k reqs/sec whereas our other more lightly loaded servers don't
> exhibit the same exact symptoms (although they also have been known to spin
> on the CPU.)
> 
> -- 
> Kelsey Cummings - kgc at corp.sonic.net      sonic.net, inc.
> System Architect                          2260 Apollo Way
> 707.522.1000                              Santa Rosa, CA 95407
> 




