Couldn't resolve certain major sites

MBernha at bart.gov
Thu Oct 26 18:29:30 UTC 2006


We are running BIND 9.2.3 (yes, I know I need to upgrade; I'm working on
it). Yesterday, after 2 years of flawless operation, we suddenly could no
longer resolve names for Google, Yahoo and Microsoft. After about 3 hours,
Yahoo was resolvable again but the others were not. Other sites resolved
fine, though slowly. It is possible that our problems were due to faulty
router hardware, but I need to rule out BIND first, so please bear with
me. I looked through the archives but found no directly usable
information.

I reconfigured our DNS servers to forward-only to our ISP's servers, and
everything worked. I then looked at our configuration and found that we
had been set up to use source port 53 for outgoing queries. I removed
that setting, removed the forward-only, and everything worked normally. I
also had to push a new access list to our outside routers to make this
work, hence the doubt about router hardware (the only change was to allow
traffic from source port 53 to high-numbered ports on our DNS servers).
Our firewall policy, inside those routers, did not change.
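
For anyone who wants to reproduce the source-port comparison outside of
named, here is a minimal Python sketch that sends one hand-built UDP query
from a chosen source port. The function names (build_query,
query_from_port) are made up for illustration and are not part of BIND or
any library. Binding to port 53 normally needs root, and is likely to fail
while named itself holds that port on the same address.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234, rd=False):
    """Build a minimal DNS query packet with a single question (class IN)."""
    flags = 0x0100 if rd else 0x0000  # RD bit only; everything else zero
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def query_from_port(server, name, source_port=0, timeout=3.0):
    """Send one query from source_port (0 lets the OS pick a high port).

    Returns the raw reply bytes, or None if nothing usable came back.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.settimeout(timeout)
        sock.bind(("", source_port))  # raises if the port is unavailable
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(4096)
            return reply
        except OSError:  # includes timeouts
            return None
    finally:
        sock.close()
```

Calling query_from_port(server, "www.google.com", source_port=53) and then
the same call with source_port=0 against the same authoritative server
(server being whatever address you are testing) shows whether the source
port alone changes the outcome.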

In a packet trace, I saw that queries to Microsoft simply were not
answered. The trace agreed with the firewall log: the firewall was not
dropping any queries or replies that came its way. I did notice a single
ICMP administratively-prohibited message whenever I restarted BIND and
queried again.

My questions:
1. Do these sites now require BIND's default behavior, i.e. a
high-numbered source port? Is it now considered unacceptable to source
queries from port 53?
2. What does BIND do when it receives an ICMP administratively-prohibited
message like the one described above?
3. With Google in particular, we found that we could resolve "google.com"
but not "www.google.com". I never even saw the query hit the sniffer,
presumably because our DNS server had already cached a negative response.
dig returned SERVFAIL. Any idea what was going on here?
4. After I began allowing named to use high-numbered source ports, dig
options such as +trace began working for all domains; previously they did
not work for any. Is there a good reason for this?
5. Do you think the problems described were related to BIND at all, or
should I be focusing on other causes?
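
When testing question 3, it helps to tell a SERVFAIL reply apart from no
reply at all. Here is a minimal Python sketch that reads the response code
from a raw reply, assuming the standard 12-byte DNS header with the RCODE
in the low four bits of the flags word. The name describe_reply is made up
for illustration.

```python
import struct

# Response codes defined in RFC 1035
RCODES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def describe_reply(reply):
    """Summarize a raw DNS reply (bytes), or None if no reply arrived."""
    if reply is None:
        return "no reply (timeout)"
    if len(reply) < 12:
        return "reply too short to contain a DNS header"
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    rcode = flags & 0x000F  # RCODE is the low four bits of the flags word
    name = RCODES.get(rcode, "RCODE %d" % rcode)
    return "%s, %d answer(s)" % (name, ancount)
```

A "no reply" result points at something dropping packets on the path, while
SERVFAIL means the queried server did answer but could not complete the
lookup.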

Thanks for your help! If you need more information please ask.


