EDNS and BIND 9.2.3?

Fri Feb 11 02:45:28 UTC 2005

    >> Hi All: I have BIND 9.2.3 running on FreeBSD 5.2.1. The
    >> installation has been running fine for about a year now, but
    >> BIND has stopped resolving external queries with the error:

    >> "connection timed out; no servers could be reached".

This usually has one of two causes. [1] A network connectivity problem
or a firewall is blocking or dropping DNS packets. [2] You are querying
name servers that are dead.

It could also happen if your server's OS couldn't allocate network
buffers. But in that case it would be obvious your server was sick and
there would be loud screams from the OS in the logs and on the system
console.
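
One rough way to tell the two usual causes apart is to send the same
query to several unrelated name servers and note which ones answer. The
sketch below only encodes that reasoning. The function name and the
wording of the verdicts are made up for illustration, and the addresses
in the usage note are documentation placeholders, not real servers.

```python
def triage(answered):
    """Classify probe results.

    `answered` maps a server address to True if that server replied
    and False if the query to it timed out.
    """
    if not answered:
        return "no servers probed"
    replies = sum(1 for ok in answered.values() if ok)
    if replies == 0:
        # Unrelated servers rarely all die at once, so look locally.
        return "nothing answered: suspect local connectivity or a firewall"
    if replies < len(answered):
        silent = sorted(s for s, ok in answered.items() if not ok)
        return "some servers silent, possibly dead: " + ", ".join(silent)
    return "all servers answered: the timeout is not reproduced"
```

For example, triage({"192.0.2.1": False, "192.0.2.2": False}) points
at the local network, while a mix of True and False points at the
individual servers that stayed silent.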

BTW, the error message you report comes from dig, not the name server.
So are you sure it's your name server that's broken?

    >> I've dug around and found some mention about TLD servers
    >> returning ipv6 "extended" DNS responses greater than 512 bytes,
    >> which may be causing BIND to timeout. I tested with:

    >> dig +norec +bufsize=4096 soa com @a.root-servers.net

    >> and DID get a response, so I assume that the EDNS response is
    >> the problem.

Wrong. The fact that you got a response to that query proves that
EDNS0 is working fine. Your server might be sending EDNS0 queries to a
badly broken server that doesn't understand them and fails to send a
reply saying so, as it should. [When BIND9 encounters a server that
says it can't speak EDNS0, it automatically remembers that and repeats
the query with EDNS0 switched off.] If this is happening, you can
verify it by querying the suspect server(s) directly. Then tell your
server not to use EDNS0 when talking to those broken name servers.
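
To make "querying the server(s) directly" concrete, here is a minimal
sketch that builds the same non-recursive query twice, once plain and
once with an EDNS0 OPT record, and sends each over UDP. The helper
names (encode_name, build_query, probe) are invented for this example,
it handles only IPv4 and UDP, and it reads just the 4-bit RCODE from
the reply header. Treat it as an illustration, not a replacement for
dig.

```python
import socket
import struct

def encode_name(name):
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name, qtype, query_id=0x1234, edns_bufsize=None):
    """Build a non-recursive query; add an OPT record if edns_bufsize is set."""
    arcount = 0 if edns_bufsize is None else 1
    # Header: ID, flags (all zero, so RD is off), QD/AN/NS/AR counts.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    opt = b""
    if edns_bufsize is not None:
        # OPT RR: root name, type 41, class = UDP payload size,
        # TTL = extended RCODE/version/flags (all zero), empty RDATA.
        opt = b"\x00" + struct.pack("!HHIH", 41, edns_bufsize, 0, 0)
    return header + question + opt

def probe(server, name, qtype=6, edns_bufsize=None, timeout=3.0):
    """Send one query; return the reply RCODE, or None on timeout/error."""
    query = build_query(name, qtype, edns_bufsize=edns_bufsize)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(65535)
        except OSError:  # includes socket timeouts
            return None
    if len(reply) < 12:
        return None
    return struct.unpack("!H", reply[2:4])[0] & 0x000F
```

Calling probe(addr, "com.") and then probe(addr, "com.",
edns_bufsize=4096) against the same address shows the difference. A
server that answers the plain query but stays silent (None) on the
EDNS0 one is the badly broken kind described above. A well-behaved
server that lacks EDNS0 support would instead be expected to answer
with an error RCODE such as FORMERR (1) or NOTIMP (4). To stop named
from using EDNS0 with such a server, BIND 9's named.conf has, as far as
I recall, an "edns no;" setting inside the per-address "server"
statement. Check the documentation for your release before relying on
it.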

    >> My question is, are there any resolutions to this problem other
    >> than upgrading to 9.3.0 and turning off ipV6 with "named -v4"?

You should be running 9.3 anyway. However, that has no bearing on the
problem you reported. It doesn't even appear to be a problem with your
name server. And what makes you think IPv6 transport has any impact on
resolution? You've presented no evidence that IPv6 is involved.

    >> If I need an immediate resolution, is there any downside to
    >> upgrading to 9.3.0 immediately? Or is there anything else I can
    >> do to get the existing 9.2.3 to work?

You could start by telling us which names you tried to resolve and
what servers were queried when you got the error message. But before
you do that, check where dig is sending its queries. You're probably
sending queries to an address that doesn't have a name server on
it. Next, go through the change control logs and see if the failed DNS
resolution(s) correspond to any recent changes to router or firewall
configurations. There is more than likely a local misconfiguration
that is blocking previously allowed DNS traffic.
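
When dig is run without an @server argument, it takes its servers from
the resolver configuration, normally the "nameserver" lines in
/etc/resolv.conf. The sketch below lists those addresses so you can see
where the queries are actually going. The function names are made up
for this example, and the parser is deliberately simple: it ignores
every directive except "nameserver".

```python
def resolv_conf_nameservers(text):
    """Return the addresses on 'nameserver' lines of resolv.conf text."""
    servers = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines starting with '#' or ';'.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers

def read_system_nameservers(path="/etc/resolv.conf"):
    """Read and parse the system resolver configuration file."""
    with open(path) as handle:
        return resolv_conf_nameservers(handle.read())
```

If read_system_nameservers() returns an address that has no name
server listening on it, that alone would explain the "no servers could
be reached" message.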

I think you've not taken a systematic approach to troubleshooting the
problem. You said that all of a sudden DNS resolution started to fail
with connection timeout errors. That can't be attributed to the
version of BIND you're running. Why would the server's behaviour just
change like that? There has to be another explanation, e.g. a change
to a router or firewall configuration, or some sort of network
connectivity problem. And since the error message isn't coming from
the name server, the name server is unlikely to be the source of the
problem at all.