Query sent, but no response
Dan Lowe
dan at tangledhelix.com
Fri Sep 23 08:39:06 UTC 2005
I'm seeing an odd problem on my recursive resolvers that I can't
quite figure out. For queries in the sfchron.com zone (say,
mr1.sfchron.com. A), I am unable to get a response. If I
log into the resolver host and use "dig" against the authoritative
servers directly, I get a response, e.g.
dig @ns1.sfchron.com. mr1.sfchron.com. a
dig @ns2.sfchron.com. mr1.sfchron.com. a
However, if I dig @localhost (i.e. I am asking my resolver for the
answer), the result varies. On several of the resolvers, I
get the same answer as the above queries. On others, I get a timeout.
On all resolvers, dig @nsN.sfchron.com works; only on
some does the query time out when I ask localhost.
On the hosts where I get timeouts, other queries against localhost
work fine (for instance, yahoo.com. MX).
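For anyone who wants to repeat the comparison above across many resolvers at once, here is a rough sketch in Python using only the standard library. It is not what I ran (I used dig by hand); the function names build_query, probe and survey are made up for illustration, and it assumes each resolver accepts UDP queries on port 53 from the host running the script.

```python
import socket
import struct

def build_query(name, qtype=1, qid=0x1234):
    """Build a minimal DNS query (RFC 1035 wire format); qtype 1 is an A record."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # QTYPE followed by QCLASS 1 (IN).
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)

def probe(server, name, timeout=5.0):
    """Send one UDP query to server:53 and classify the outcome."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
        except OSError:
            return "error"
    # Reject replies that are too short or carry a different transaction ID.
    if len(data) < 12 or data[:2] != packet[:2]:
        return "error"
    rcode = data[3] & 0x0F
    ancount = struct.unpack("!H", data[6:8])[0]
    if rcode == 0 and ancount > 0:
        return "answer"
    return "rcode=%d ancount=%d" % (rcode, ancount)

def survey(resolvers, name):
    """Probe every resolver in the list and return {resolver: outcome}."""
    return {server: probe(server, name) for server in resolvers}
```

Calling survey() with a list of resolver addresses (placeholders such as 192.0.2.1, not my real hosts) and "mr1.sfchron.com." gives one outcome per resolver, which makes the "answer on some, timeout on others" split easy to see in one pass.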
All resolvers are running BIND 9.3.1 on Sun Solaris 8 SPARC. I have
reviewed the BIND configs to ensure the sfchron.com nameservers are
not blackholed.
Results from snoop and tcpdump are consistent with the errors
returned; i.e. when the client returns a timeout, I see a request go
out, but no answer comes back.
Has anyone seen this before? I've seen similar behavior in the past,
but it always turned out to be something simple (the remote end was
filtering us, or we had the remote site's IP in our blackhole list...)
Lastly, the pattern appears to change over time. Of my 17
resolvers, a certain subset gets timeouts on one day, a different
subset the next day, and yet another the day after. A small handful
seem to be on the timeout list constantly, but so far I am
treating that as a coincidence.
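To check whether that handful really is constant rather than a coincidence, the daily timeout lists could be tallied. The following is only a sketch: summarize_timeouts is a hypothetical helper, and the input is assumed to be a mapping from a day label to the set of resolvers that timed out that day.

```python
from collections import Counter

def summarize_timeouts(daily_timeouts):
    """Return (per-resolver timeout counts, resolvers that timed out every day).

    daily_timeouts maps a day label to the set of resolver names
    that timed out on that day.
    """
    days = [set(names) for names in daily_timeouts.values()]
    # How many days each resolver appeared on the timeout list.
    counts = Counter(name for day in days for name in day)
    # Resolvers present on every single day's list.
    always = set.intersection(*days) if days else set()
    return counts, always
```

A resolver that shows up in the second return value after many days of observation is less likely to be there by chance, which would point at something specific to that host rather than at the remote end.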
Thanks for any help you can offer.
-dan
--
logic (n.): the art of being wrong with confidence.