BIND 9.x caching Performance under heavy loads

Kelsey Cummings kgc at
Fri Mar 25 00:57:37 UTC 2005

On Thu, Mar 24, 2005 at 11:00:21PM +0100, sthaug at wrote:
> > 	cleaning-interval 60; //Default?
> > 	max-ncache-ttl 120;
> max-ncache-ttl counts in seconds, according to the documentation. Why
> do you want to limit this to 2 minutes?

Pressure from vocal customers about things getting 'stuck' in the ncache
and causing me political grief.

> Aside from that, I can only recommend the same as has been mentioned
> by several others on the list:
> - Make sure your name server has enough memory. Let the cache reach its
> "natural" size (let it stabilize), don't try to limit it.

I wasn't, beyond using the stated cleaning-interval, which I've now set to
0 to see how it affects long-term CPU usage vs. memory consumption.  Am I
right in guessing that no cache cleaning is ever done if this is set to
0?  Will this result in a name server with a stable memory footprint?
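
As a rough sketch of the change described above (only these two options
come from this thread; the rest of the options block is assumed to be
unchanged and is not shown):

	options {
		// Minutes between periodic cache cleaning passes; the
		// default is 60.  Per the BIND 9 documentation, 0 means
		// no periodic cleaning will occur.
		cleaning-interval 0;

		// Seconds; upper bound on how long negative answers
		// are kept in the cache.
		max-ncache-ttl 120;
	};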

> - Look at how much logging is turned on. Logging can be a real CPU hog.

Well, CPU usage under normal operation isn't the problem here.

> Also, if you have a uniprocessor system, you might want to test the
> patch I recently posted to enable better BIND9 statistics.

The systems are all P4 CPUs with HT enabled. 

While the comments are appreciated, they don't address the specific problem,
which is BIND going into a 100% CPU spin for no apparent reason after a long
period of stable operation.  During this time it's still answering requests,
but with delays in the 10-30 second range.

Is anyone else having this problem?  Any suggested fixes?  If I catch it in
this state again, can I get debugging information out of it that would help
a BIND wizard track down the problem (and perhaps fix a bug)?  Notably, my
dual-processor name servers do not exhibit the same problem with identical
configurations, binaries, kernel, etc.

