9.3.0 and 9.3.1 blocking during maint interval
jcoombs at gwi.net
Tue Mar 15 15:51:53 UTC 2005
While trying to track down periodic RADIUS failures, I discovered that BIND
was periodically timing out, and occasionally even responding incorrectly
with a failure. We were originally running 9.2.3 built from ports on
FreeBSD 4.9p11, with a mem limit of 900M and a maint interval of 60 minutes.
The failures were 61 minutes apart, like clockwork.
We moved up to 9.3.0, again built from ports, and continued to observe the
same problem. I then built from source with threading enabled, with no luck.
A quick discussion with the port maintainer pointed out that 9.3.1 would have
'major threading fixes' for FreeBSD, so I waited for it to come out. Now
that it's out, I've built it with threading enabled, and I still have the
periodic outages. I currently have the maint interval set to 15 minutes, and
the failures track that period like clockwork.
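A rough way to check the "like clockwork" observation against logged failure times is sketched below in Python. The timestamp format, the function names, and the two-minute tolerance are assumptions chosen for illustration; nothing here comes from BIND or from the RADIUS monitoring.

```python
from datetime import datetime


def gaps_in_minutes(stamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the gaps, in minutes, between consecutive failure timestamps."""
    # Sort first so the input does not have to be in chronological order.
    times = sorted(datetime.strptime(s, fmt) for s in stamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


def tracks_interval(gaps, interval_minutes, tolerance_minutes=2.0):
    """True if there is at least one gap and every gap is within the tolerance."""
    return bool(gaps) and all(
        abs(g - interval_minutes) <= tolerance_minutes for g in gaps
    )
```

Feeding it the failure times from the monitoring log and the configured interval (60 or 15 minutes here) gives a yes/no answer instead of an eyeball estimate.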
At the moment, my primary source of data is my RADIUS server monitoring,
as I don't have a direct long-term DNS monitor running yet. I've been
testing by running nslookup requests inside while loops from the command
line and observing the output.
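A minimal sketch of a longer-running probe that could stand in for the nslookup loops, assuming Python 3 is available on a client host. It goes through the host's configured resolver via `socket.gethostbyname`, so it only exercises the BIND server if that host points at it, and any local hosts-file or cache hit would hide a failure. The function names, the log line format, and the five-second default are illustrative choices.

```python
import socket
import time
from datetime import datetime


def probe(name, resolve=socket.gethostbyname, clock=time.monotonic):
    """Resolve `name` once and return (ok, elapsed_seconds, detail)."""
    start = clock()
    try:
        detail = resolve(name)  # goes through the host's configured resolver
        ok = True
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        detail = str(exc)
        ok = False
    return ok, clock() - start, detail


def monitor(name, interval=5.0, count=None, resolve=socket.gethostbyname,
            sleep=time.sleep, out=print):
    """Probe `name` every `interval` seconds, writing one timestamped line each time.

    Runs until interrupted when `count` is None; otherwise stops after `count` probes.
    """
    done = 0
    while count is None or done < count:
        ok, elapsed, detail = probe(name, resolve=resolve)
        stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        status = "OK" if ok else "FAIL"
        out(f"{stamp} {status} {elapsed:.3f}s {name} {detail}")
        done += 1
        sleep(interval)
```

Calling `monitor("www.example.com")` (a placeholder name) and redirecting the output to a file leaves a timestamped record of slow or failed lookups that can be lined up against the maint interval.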
The host system for BIND is running at 9 to 14% CPU load, even during the
maint windows, so I don't believe the host system is overloaded.
How should I proceed to diagnose and correct this?