Bind9 Crazy-high CPU on Linux

Schlosser, Matt D. mschlosser at
Mon Jan 15 21:03:39 UTC 2007

P3 1400 MHz
Debian Sarge (3.1), also tried Etch (4.0 beta)
BIND 9: 9.2.4-1sarge1 (9.2.4 with Debian patches); also tried
9.3.2-P1.0.1 (9.3.2 with Debian patches)
Kernel 2.4.25 and 2.6.8-2 (2.6.8 with Debian patches)

There are two of these machines with the following issue:

The machines handle between 800 and 1,000 queries/second, a mix of
authoritative and recursive queries.  After 12-24 hours, CPU usage
spikes to 100% and stays there, and all further queries time out.
The only fix we have found is to restart BIND.

After exhausting our efforts to isolate the problem (kernel, NIC
drivers, BIND version, configs, etc.), we installed Windows 2003 and
BIND 9.3.3 for Windows on one of the servers and copied over the configs
and zone files.  That server has been up for 4 days with no problems.

We would prefer not to run on Windows, mainly because our entire DNS
infrastructure is Linux-based and we would have to redevelop our
provisioning and management systems, which is a rather large
undertaking.  We won't get budget to throw more horsepower at the
problem, since Windows is a working solution.  Besides, we are not
convinced that faster hardware would help.

Has anyone seen or resolved this before?


