How to prevent slaves from contacting master for name resolution?
kcd at chrysler.com
Mon May 17 23:24:18 UTC 2010
On 5/17/2010 5:58 PM, Keith Christian wrote:
> Our redundant DNS configuration is one master and three slaves, spread
> across two colo facilities.
> master and slave1 are in colo_ALPHA.
> slave2 and slave3 are in colo_BETA.
> During an extended maintenance window, the master DNS was offline.
> Slave2 was trying to contact the master, and lookups failed. Usually,
> slave2 resolves without contacting the master, but occasionally it
> The IP for the master does not appear in slave2's /etc/resolv.conf,
> and I'm not sure what else to check for on slave machines. Where else
> would I look? Would any settings in named.conf account for this
> Versions are Linux (CentOS 5) and BIND 9.5.x.
These queries that were failing, were they queries from external
clients, queries you were generating on the slave nameserver itself, or
some other queries entirely?
If queries from external clients were failing while the master was down
and the slaves were up, then the most likely cause is that your NS
records are screwed up, such that your slaves aren't being found. Either
you have the wrong names in your delegation NS records, or those at the
apex of the zone, missing/incorrect glue in the parent zone, something like that.
If queries you were generating *locally* were failing, the questions are:
a) are you querying the slave directly?
b) does the slave have recursion turned completely off?
If you're querying the slave directly and recursion is off, and the
queries are failing, then this should have nothing whatsoever to do with
the master being down, since the data comes *only* from the slave. You
should look at whether the zone is being loaded correctly on the slave,
a corrupted journal file, expired zone, something like that.
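On the "expired zone" possibility: a slave treats a zone as expired once the time since its last successful refresh exceeds the SOA expire value. A minimal sketch of that arithmetic in Python, assuming you already know when the slave last refreshed (for example from its logs) and the zone's SOA expire field. The function name `zone_expired` and its parameters are made up for illustration; they are not anything BIND provides.

```python
from datetime import datetime, timedelta, timezone


def zone_expired(last_refresh, soa_expire_seconds, now=None):
    """Return True if the zone would be past its SOA expire time.

    last_refresh: timezone-aware datetime of the last successful refresh
                  (assumed to be known, e.g. read from the slave's logs).
    soa_expire_seconds: the expire field from the zone's SOA record.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_refresh > timedelta(seconds=soa_expire_seconds)
```

If this comes back True for the length of your maintenance window, the expire value in the SOA is the first thing to look at.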
If you're querying the slave directly and recursion is *on* for the
slave, then you should check whether the slave is actually responding
authoritatively for the zone in question under normal circumstances. If
not, then something is misconfigured: the slave isn't really a slave at
all, and under normal circumstances it's just recursing to get the
information, and it makes sense that when the master (and/or other
slaves) are unavailable, queries will fail.
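One way to make the "responding authoritatively" check concrete is to send the slave a query with recursion-desired off and look at the AA bit in the reply header (the same bit `dig +norecurse` reports as the `aa` flag). A rough Python sketch using only the standard library follows. The function names are my own, and the server address and zone name in the usage note are placeholders, not values from your setup.

```python
import socket
import struct


def build_query(name, qtype=1, recursion_desired=False, query_id=0x1234):
    """Build a minimal DNS query: one question, class IN, default type A."""
    flags = 0x0100 if recursion_desired else 0x0000  # RD bit only
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header_flags(response):
    """Extract the AA and RA bits and the RCODE from a DNS response header."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _query_id, flags = struct.unpack("!HH", response[:4])
    return {
        "aa": bool(flags & 0x0400),  # answer is authoritative
        "ra": bool(flags & 0x0080),  # server offers recursion
        "rcode": flags & 0x000F,     # 0 = NOERROR
    }


def query_flags(server, name, timeout=3.0):
    """Send a non-recursive query over UDP and return the parsed flags."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _addr = sock.recvfrom(512)
    return parse_header_flags(response)
```

Calling something like `query_flags("192.0.2.2", "example.com")` against the slave should show `aa` set if it really is serving the zone from its own copy. If `aa` is clear, the slave is probably not authoritative for that zone and has been recursing all along.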
If you're querying some other box, then you'd need to look at that other
box to see why the queries were failing. At that point, the fact that
you're generating queries from the slave itself has no bearing on the
problem, since the failed queries weren't going to the slave in the first place.
Another possibility: were you using "nslookup" to test queries, and does
the master for your forward zone(s) also happen to be authoritative for
the *reverse* zone which contains the address of your resolver? If so,
then be aware that "nslookup" has the annoying feature of trying to
reverse-resolve the name of the resolver it's using. So, maybe the
queries you're seeing going from slave to master are *reverse* DNS
queries, and if they're failing because the master is down, "nslookup",
in its quaint, eccentric way, may be misreporting this as a general
lookup failure, thus making you think that the forward name is unresolvable.
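To see which reverse zone is involved, you can work out the PTR name that corresponds to the resolver's address. A small Python sketch using the standard `ipaddress` module; the address shown is a documentation placeholder, so substitute whatever resolver address "nslookup" is actually using.

```python
import ipaddress


def reverse_name(resolver_ip):
    """Return the in-addr.arpa / ip6.arpa name for a resolver address."""
    return ipaddress.ip_address(resolver_ip).reverse_pointer


# Placeholder address; replace with your resolver's IP.
print(reverse_name("192.0.2.53"))  # 53.2.0.192.in-addr.arpa
```

If the master is the only reachable server for the zone containing that name, the reverse lookup is the one that depends on the master.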