DNS Redundancy
Niall O'Reilly
Niall.oReilly at ucd.ie
Thu Oct 21 11:58:35 UTC 2010
On 21 Oct 2010, at 12:32, Martin McCormick wrote:
> The normal procedure on internet-connected systems is to
> set the resolv.conf file to include at least 2 domain name
> servers. Example:
>
> nameserver 139.78.100.1
> nameserver 139.78.200.1
>
> 	Last night, I had to take down our primary DNS for
> maintenance and lots of FreeBSD and Linux systems began having
> trouble of various kinds.
>
> While I expected the FreeBSD system I was on to hang for
> a couple of seconds and then start using the second DNS, it
> basically froze while some Linux boxes also began exhibiting
> similar behavior.
>
> I finally manually changed the resolv.conf on the system
> I was using to force the slave DNS to be first in the list and
> that helped, but losing the primary DNS was not the slight
> slowdown one might expect. It was a full-blown outage.

It's a good idea to keep your authoritative name service
(for announcing DNS records for your part of the DNS) separate
from your resolver name service (for mediating name service
to the clients on your network).

/etc/resolv.conf (or equivalent on other platforms) specifies
where the client should look for resolver service. The
addresses in there should best not be those of the master
or slave server for your DNS zone(s).

Without more detail, it's difficult to say exactly what chain
of cause and effect led to your "full-blown outage".

It's well to bear in mind that the typical (Unix-like) client
will always step through the nameserver addresses in the order
in which they appear in /etc/resolv.conf. If you're planning to
take one of them down for maintenance, and wish to avoid
client-side delays, you need either to configure the clients
in advance (for example, by using DHCP) with a different
/etc/resolv.conf, or to instantiate the first address in the
list on the second server. There is no "one true way".
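
To put a rough number on the client-side delay, here is a minimal
Python sketch of a stub resolver stepping through its nameserver list
in order. It is a simplified model, not the actual resolver library
code: the function names are hypothetical, the 5-second timeout and
2 attempts are assumed values (the defaults documented in resolv.conf(5)
on many systems), and real resolvers may scale the timeout between
attempts.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they appear."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter here; comments and
        # other directives are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def seconds_until_answer(servers, reachable, timeout=5, attempts=2):
    """Simplified model of a single lookup.

    Each server is tried in list order; a dead server costs `timeout`
    seconds before the next one is tried. The whole list is walked up
    to `attempts` times. Returns the seconds waited before the first
    answer, or None if no server ever answers.
    """
    waited = 0
    for _ in range(attempts):
        for server in servers:
            if server in reachable:
                return waited
            waited += timeout
    return None


conf = "nameserver 139.78.100.1\nnameserver 139.78.200.1\n"
servers = parse_nameservers(conf)
up = {"139.78.200.1"}  # first server down for maintenance

print(seconds_until_answer(servers, up))                   # original order
print(seconds_until_answer(list(reversed(servers)), up))   # order swapped
```

Under these assumptions, every single lookup pays the full timeout
while the first address is dead, which is why a program that makes
many lookups in sequence can look frozen rather than merely slow.
Swapping the order removes the wait in this model.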
On the other hand, dedicated resolver servers (at least those
running BIND named) keep track of the state of the authoritative
servers for the names for which they are processing queries, and
automagically ignore any that are unreachable. This allows my
customers (for example) to be spared delay when you take one of
your authoritative servers down.
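
The general idea of remembering which servers are unreachable can be
sketched in a few lines of Python. This is only an illustration of the
principle and not how named does it; its real bookkeeping is more
elaborate. The class name, the method names, and the 60-second
hold-down are all hypothetical.

```python
class ServerTracker:
    """Remember which servers recently failed and skip them for a while."""

    def __init__(self, servers, holddown=60.0):
        self.servers = list(servers)
        self.holddown = holddown  # seconds to avoid a failed server (assumed value)
        self.failed_at = {}       # server -> time of its last recorded failure

    def mark_failed(self, server, now):
        """Record that a query to `server` timed out at time `now`."""
        self.failed_at[server] = now

    def mark_ok(self, server):
        """Record that `server` answered, clearing any earlier failure."""
        self.failed_at.pop(server, None)

    def candidates(self, now):
        """Servers worth trying at time `now`, in configured order.

        Servers still inside their hold-down period are left out. If
        every server is held down, all of them are returned so that a
        query is still attempted.
        """
        live = [
            s for s in self.servers
            if s not in self.failed_at
            or now - self.failed_at[s] >= self.holddown
        ]
        return live or list(self.servers)
```

The difference from the stub resolver in /etc/resolv.conf is that the
failure is remembered across queries, so only the first query after
the outage pays the timeout.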
Best regards,
Niall O'Reilly