slow query times after Bind upgrade
diggins at McMaster.CA
Sat Feb 11 23:00:32 UTC 2006
Last week I began upgrading my two slave name servers from
Bind 9.2.4 to 9.3.2. I have completed one so far. Everything appeared to be
fine, but later I noticed that a few nslookups from a Windows workstation
occasionally fail with a "DNS Server timeout. 2 seconds". I began testing
5 external domain names against my 9.2.4 slave and my 9.3.2 slave and found
that the original always succeeds, but the upgraded server times out
occasionally. The trace shows that the query is always answered, just
not always within the two-second timeout window. I would expect a cached
entry to be answered well within that two-second window.
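
A comparison like the one described above can be scripted so that the slow responses are timed rather than just reported as failures. The following is a minimal sketch, not the test actually used: the server addresses and domain names are placeholders, the function names are made up for illustration, and the reply is not checked against the query ID.

```python
import socket
import struct
import time


def build_query(name, qid=0x1234):
    """Build a minimal DNS query packet for an A record, recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack(">HH", 1, 1)


def time_query(server, name, timeout=2.0, port=53):
    """Send one UDP query; return elapsed seconds, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        start = time.monotonic()
        sock.sendto(build_query(name), (server, port))
        sock.recvfrom(4096)  # any reply counts; the ID is not verified here
        return time.monotonic() - start
    except socket.timeout:
        return None
    finally:
        sock.close()


if __name__ == "__main__":
    # Placeholder addresses and names; substitute the real slaves and domains.
    servers = {"9.2.4 slave": "192.0.2.1", "9.3.2 slave": "192.0.2.2"}
    names = ["example.com", "example.net", "example.org"]
    for label, addr in servers.items():
        for name in names:
            # A long timeout shows how late the slow answers really are.
            elapsed = time_query(addr, name, timeout=10.0)
            result = "no reply" if elapsed is None else "%.3f s" % elapsed
            print("%-12s %-15s %s" % (label, name, result))
```

Running it with a timeout well above two seconds shows how long the delayed answers actually take, which a client with a two-second limit cannot report.
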
I've sniffed the wire at both the workstation and server end and can see
the delayed response, which appears to originate at the server itself (ruling
out the network and workstation). I'd like to see exactly what is happening on
the server to the queries that are failing. What is the right debug level
for Bind to catch this information? Are there any other observations that
might explain the different behaviour?
P.S. The original slave (9.2.4) is a Sun Ultra10 running Solaris 8. The
updated server (9.3.2) is a Sun V210 (dual processor) running Solaris 10.
Bind was compiled with multiprocessor support. One other thing I've noticed is
that the default timeout for the nslookup included with Solaris is much
higher than two seconds. Hmmm.