Slaves not picking up when master is offline

John Wobus jw354 at cornell.edu
Fri Dec 21 20:59:34 UTC 2007


You probably need to post actual zones and configs, but here are some 
things to check:

(1) Verify the delegations directly, e.g. with dig rather than whois.
(2) Make sure the slaves are authoritative.
(3) Make sure the slaves actually hold current copies of the zone files 
that they should.
(4) Make sure the slaves' responses for these domains are flagged 
"authoritative" (the "aa" flag), e.g. with dig.
(5) Do this check from a customer site, to check for access problems 
due to firewalls, access lists, routing, etc.
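
As a rough sketch of checks (2) and (4), something like the following 
could be run from a customer site.  The function names are made up for 
illustration, and the zone and server names in the usage note are the 
placeholders from the quoted example, not real hosts.

```shell
# Succeeds if a dig response header read from stdin carries the "aa" flag.
has_aa_flag() {
    grep -Eq '^;; flags:[^;]* aa[ ;]'
}

# Ask each server directly (no recursion) for the zone's SOA and report
# whether the answer is flagged authoritative.
# Usage: check_authoritative ZONE SERVER...
check_authoritative() {
    zone="$1"; shift
    for ns in "$@"; do
        if dig +norecurse "@$ns" "$zone" SOA | has_aa_flag; then
            echo "$ns: authoritative"
        else
            echo "$ns: not authoritative, or no answer"
        fi
    done
}
```

For example: check_authoritative domain.net ns1.nameserver.net 
ns2.nameserver.net ns3.nameserver.net.  For check (1), "dig +trace 
domain.net NS" follows the delegation chain down from the root servers, 
which shows what the parent zone actually hands out rather than what 
whois lists.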

Authoritative slaves do not actually "pick up" that the master is down: 
they all should be functioning all the time.  From a DNS delegation and 
authority point of view, they are peers.  The DNS "master/slave" 
mechanism only deals with how data is loaded into the servers.  If the 
slaves never respond correctly, normal DNS caching can mask this, and 
the problem might only be noticed when the one working authoritative 
nameserver is down.
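
One way to see whether every server answers on its own, while the 
master is still up, is to ask each one for the SOA and compare serials.  
This is only a sketch: the helper names are hypothetical, and it 
assumes the usual single-line "dig +short" SOA output with the serial 
as the third field.

```shell
# Print the serial from `dig +short ... SOA` output read from stdin.
# Lines whose third field is not numeric (e.g. timeout messages) are skipped.
soa_serial() {
    awk '$3 ~ /^[0-9]+$/ { print $3; exit }'
}

# Query each server directly and print "server serial" per line.
# Usage: compare_serials ZONE SERVER...
compare_serials() {
    zone="$1"; shift
    for ns in "$@"; do
        serial=$(dig +short +norecurse "@$ns" "$zone" SOA | soa_serial)
        echo "$ns ${serial:-no-answer}"
    done
}
```

A slave that prints "no-answer", or a serial older than the master's, 
is worth a closer look.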

The 14400 TTL (i.e. 4 hours) in conjunction with your observation that 
the symptom happens at 4+ hours hints that the slaves are indeed never 
functioning as they ought.  DNS records cached from the master would 
time out within four hours of when the master goes down.
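
The arithmetic, using the $TTL and SOA expire values from the zone 
quoted below (the variable names are just for illustration):

```shell
ttl=14400       # $TTL from the quoted zone, in seconds
expire=604800   # SOA expire from the quoted zone, in seconds

ttl_hours=$((ttl / 3600))        # how long resolvers may cache a record
expire_days=$((expire / 86400))  # how long a slave keeps serving without the master

echo "cached records last up to $ttl_hours hours"
echo "slaves keep the zone for $expire_days days"
```

With working slaves the zone would stay available for up to a week 
after the master disappears, so a failure at around four hours lines up 
with the cache TTL rather than with the SOA expire timer.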

John Wobus
Cornell University IT




On Dec 21, 2007, at 4:42 AM, Andre wrote:

> I'm having a problem with my DNS-setup. I'm running 1 master and 2 
> slave nameservers. These servers are authoritative for about 20,000 
> domains which we registered for our clients.
> The problem is: when the master server is offline for several (4+) 
> hours, the slaves don't seem to pick this up properly. Several of our 
> customers can't connect to their domains because of unresolving 
> hostnames.
>
> AFAIK the set-up for the domains is ok. Here's an example of such a 
> domain in the masterserver.
>
> $TTL 14400
>
> @ IN SOA ns1.nameserver.net.  info.registrar.net.   (
>                                          2007102401  ; serial
>                                          10800  ; refresh
>                                          3600  ; retry
>                                          604800  ; expire
>                                          86400)  ; min_ttl
>
> domain.net.         IN NS ns1.nameserver.net.
> domain.net.         IN NS ns2.nameserver.net.
> domain.net.         IN NS ns3.nameserver.net.
> mail.domain.net.    IN A *.*.*.*
> www.domain.net.     IN A *.*.*.*
> localhost.domain.net.       IN A 127.0.0.1
> domain.net.         IN A *.*.*.*
> ftp.domain.net.     IN A *.*.*.*
> webmail.domain.net.         IN A *.*.*.*
> *.domain.net.       IN CNAME cname.domain2.net.
> domain.net.         IN  MX 30 mailserver.registrar.net.
> domain.net.         IN TXT "v=spf1 a mx ~all"
>
> - All domains in the masterserver are defined as 'type master' in 
> named.conf
> - All domains in the slaveservers are defined as 'type slave' and 
> contain a reference to the master
> - All 3 nameservers are listed correctly at our registrars, checked 
> through whois
> - zone-transfers and updates are working fine, so communication 
> between master and slaves is ok
>
> The only possible explanation for this problem that I can come up with 
> is that the nameserver's hostnames (nameserver.net in this example) 
> are also resolved by these servers, the nameservers are authoritative 
> for their own hostname. Is this a problem? As far as I know the 
> registrars keep a hostname as well as an IP address for these 3 
> nameservers, so this shouldn't be a big problem.
>
> Does anyone have any idea what could be wrong with this setup?
>
> Thanks a lot in advance!
>
> Andre P.
>
