peculiar lookup timeouts
adamy at mountaincable.on.ca
Wed Sep 13 21:44:10 UTC 2006
> > Hey guys,
> > I tried searching the list for the answer to my question, and I saw some
> > similar questions, but I wasn't sure that it was exactly the same.
> > So here is the scenario:
> > Running Bind-9.3.2P1 on three servers: ns1 (master), ns2 and ns3
> > (slaves). Every once in a while a customer will call in saying that
> > they cannot resolve a particular domain.
> > So, I attempt to look up the domain via "ns1" and the lookup times out.
> > A lookup via ns2 or ns3 works (sometimes).
> > As soon as I restart bind, everything works again.
> > Also, we have an internal copy of bind running, which forwards queries
> > to "ns1", and at the time when ns1 does not get an answer (while it is
> > timing out) that system answers with the right data.
> > If there is any more detailed information I can provide, please let me
> > know.
> Is ns1 the master for the zones that are having the problem? Or, are you
> calling it a "master" even though, for purposes of troubleshooting this
> problem, it's basically just a resolver? If it's just functioning as a
> resolver, are the domains you're having problems with Internet domains
> or domains that are strictly internal to your environment? If Internet
> domains, then please enumerate them so we can take a look and see if we
> can spot any obvious problems.
'ns1' is a master for a number of zones, however not for the ones it is
having problems resolving. Other than the zones it is master for, it is
used as a resolver.
The domains it is/was having problems with were, for example, godaddy.com.
That was the most recent culprit. I would receive "SERVFAIL" from 'ns1';
after I restarted the bind daemon, it returned the proper information.
When I did a 'dig godaddy.com @ns1 +trace' it would get to the nameservers
for the domain, but not be able to resolve the domain itself. Again, after
a restart of the bind daemon, it would be able to resolve the domain.
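
For anyone wanting to catch the failure while it is happening, here is a rough sketch of a check that sends the same A query to each server and reports the response code (or a timeout), so SERVFAIL on one server can be compared against the others. It uses only the Python standard library and builds the query by hand per the RFC 1035 message layout. The function names (build_query, parse_rcode, check_server) and the default query ID are illustrative assumptions, not part of BIND or dig, and the sketch does not verify that the response ID matches the query.

```python
import socket
import struct

# Common DNS response codes (RFC 1035); others are reported numerically.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query: one question, class IN, recursion desired.

    query_id is an arbitrary illustrative value; qtype=1 requests an A record.
    """
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_rcode(response):
    """Return the RCODE name taken from the low 4 bits of the header flags."""
    (flags,) = struct.unpack("!H", response[2:4])
    code = flags & 0x000F
    return RCODES.get(code, "RCODE%d" % code)


def check_server(server, name, timeout=3.0):
    """Send one UDP query to server:53 and return the RCODE name.

    Returns "TIMEOUT" if no reply arrives in time, "ERROR" on other
    socket failures. server should be an IPv4 address or resolvable host.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
        except OSError:
            return "ERROR"
    return parse_rcode(data)
```

Calling check_server() once per nameserver address with "godaddy.com" from a cron job, and logging the results, would show whether ns1 alone returns SERVFAIL or times out while ns2 and ns3 answer normally.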
I'm not sure if this has anything to do with negative caching, or caching in
general, but I would assume it does since a restart of the daemon fixes the
problem. I do limit bind to 3GB of cache as of my latest revision of
Thanks for the help,
p.s. I'm not sure what the proper etiquette for this list is. When replying,
should I include the sender? Or just reply to the list itself, since you
Systems Support Technologist
Mountain Cablevision Ltd.