Disaster Recovery BIND architecture

Thu May 22 00:41:24 UTC 2008

atlantic wrote:
> Hello,
>
> I've searched, but not found anything on this specific topic. I am about
> to implement two disaster recovery site nameservers; one internal, one
> external. I want to keep the internal entries strictly internal. The
> external will serve nat'd addresses of the internal nameserver as well
> as function as a slave to the primary site nameserver.
>
> I would have no problem implementing this model if the domain names at
> the DR site were different from the primary site. My issue is that
> because I am using the same domain name, I have had to create a custom
> scripted solution to allow the loading of split domain resource records
> (using $INCLUDE directives, and sed/awk to remove SOA and header
> information from the imported data files.) The fact that this does work
> does not negate the issue that I find the solution cumbersome. The issue
> would be much simpler if I changed the DR site to a different domain
> name, since the resource record SOA would be different.
>   

I'm confused: why do you need to do this "munging" of the zonefile? As 
far as I can understand it, the only difference between the original 
version of the zone and the "munged" version would be the SOA record and 
the apex NS records (that's what you mean by "header information", 
right?). But nothing really cares about the SOA record (except Dynamic 
Update clients and, in a multi-level slaving hierarchy, mid-level 
slaves, who use the MNAME field of the SOA record in determining who 
gets NOTIFYs), and if you put the "primary" NS(es) and the DR NS(es) at 
the apex of the zone, Internet resolvers will quickly find and use the 
DR nameservers if the primary one(s) is/are down. So there's no real 
reason for the "header" of the zone to be different on different 
nameservers, and no "munging" should be required.
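
To illustrate the point about a shared zone apex, here is a minimal Python
sketch that renders a single NS RRset listing both the primary and the DR
nameservers, so that every server can load the same zone data unmodified.
The zone name, nameserver names and TTL are hypothetical placeholders, not
taken from the setup described above.

```python
def apex_ns_records(zone, primary_ns, dr_ns, ttl=86400):
    """Render apex NS records for a zone in master-file syntax.

    All names here are illustrative; the same list is meant to be
    served by every nameserver, primary site or DR site alike.
    """
    origin = zone.rstrip(".") + "."          # fully qualified owner name
    lines = []
    for ns in list(primary_ns) + list(dr_ns):
        target = ns.rstrip(".") + "."        # fully qualified NS target
        lines.append(f"{origin}\t{ttl}\tIN\tNS\t{target}")
    return lines


if __name__ == "__main__":
    for line in apex_ns_records(
        "example.com",
        primary_ns=["ns1.example.com"],       # hypothetical primary-site server
        dr_ns=["ns1.dr.example.com"],         # hypothetical DR-site server
    ):
        print(line)
```

Because the output does not depend on which server loads it, no per-site
rewriting of the zone header is needed.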

Secondly, I don't know what you're getting at with "The external will 
serve nat'd addresses of the internal nameserver". NAT or no NAT, why 
would you want Internet resolvers querying your internal nameserver? 
That seems like a bad security practice to me. A lot of DNS-based 
exploits have been identified over the years, so I'd rather only expose 
nameservers that are on the "edges" of my network.

> Now that I have stated my issue, my real questions are:
>
> 1) How do most businesses address this issue?
>   
A variety of different ways, I'd imagine. In our case we have two main 
production datacenters that have (diverse) connectivity to the Internet 
and for most apps (e.g. web stuff) we use "global", DNS-based load 
balancing to allow the servers to run in both datacenters with the 
failover being automatic if the server(s) in one datacenter are down 
(in the worst case, because the whole datacenter is down). For DNS itself, 
since it can't really be load-balanced using DNS (slight chicken-and-egg 
problem there), we have one VIP (virtual IP) for each set of DNS servers 
at each datacenter, i.e. "local" load-balancing. So Internet DNS 
resolvers will only see two VIPs associated with the nameservers for our 
external zones, but there are multiple machines "behind" each VIP so 
that we have transparent fault-tolerance within any given datacenter, 
and if one datacenter should go down completely, we still have 
functioning nameservers in the other datacenter.

> 2) Is it normal to have a DR DNS function as both a slave to the primary
> site and a primary to different DR resources?
>   
I doubt it. Mixing up master and slave roles on various Internet-facing 
nameservers seems to me to be unmanageable and arguably insecure. Much 
simpler for them all to be slaves. (Note that I'm using the term 
"slaves" loosely here; if one wants to use another replication method 
besides AXFR/IXFR, then that's fine, and I'd still call the replicas 
"slaves" in the loose sense).

In our case, we centralize all of our external DNS maintenance on an 
internal server (with another internal server as backup), and then all 
of the Internet-facing nameservers are simply slaves for that data.
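
As a rough sketch of the "everything Internet-facing is a slave" layout,
the Python below generates named.conf zone stanzas that point at an
internal master and its backup. This assumes plain AXFR/IXFR replication;
the zone name, the 10.0.0.x master addresses and the "slaves" directory
are hypothetical, not an actual production configuration.

```python
def slave_zone_stanza(zone, masters, file_dir="slaves"):
    """Return a BIND named.conf stanza making this server a slave for zone.

    masters is a list of IP addresses of the internal master server(s);
    the values used in the example below are made up for illustration.
    """
    master_list = " ".join(f"{ip};" for ip in masters)
    return (
        f'zone "{zone}" {{\n'
        f'    type slave;\n'
        f'    masters {{ {master_list} }};\n'
        f'    file "{file_dir}/{zone}.db";\n'
        f'}};\n'
    )


if __name__ == "__main__":
    # The same stanza would be deployed to every Internet-facing server,
    # at the primary site and at the DR site.
    print(slave_zone_stanza("example.com", ["10.0.0.1", "10.0.0.2"]))
```

Since every external server gets an identical stanza, there is no
per-server mix of master and slave roles to keep track of.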

> 3) Is it acceptable to have all three nameservers (primary site, DR
> primary, DR secondary) serve the same domain name?
>
>
>   
Hmmm... why not? The more authoritative nameservers that are published 
for the zone, the more the query load is spread out, and the less impact 
there will be if any given one of them fails or becomes unavailable.

Some registries have limits on how many nameservers they'll allow in a 
delegation, but even if you just have a subset of your authoritative 
nameservers in the delegation records, as long as they are all in the 
apex NS records they'll get used (assuming that a sufficient number of 
the resolvers cache NS records according to ranking rules in RFC 2181, 
which ranks in-zone data above referral data). Don't go overboard with 
NS records, though; you don't want to have so many that you force older 
DNS resolvers into TCP retries. Try to keep the referral responses (NS 
RRset + glue) within 512 bytes, taking into account label compression.
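
To make the 512-byte guideline concrete, here is a small Python sketch
that estimates the wire size of a referral response, counting label
compression. It is only an approximation under stated assumptions: the
server compresses every name as far as possible, glue is included for
every listed address, and no EDNS OPT record is present. The names and
addresses in the example are hypothetical.

```python
HEADER_BYTES = 12          # fixed DNS message header
QUESTION_FIXED_BYTES = 4   # QTYPE(2) + QCLASS(2)
RR_FIXED_BYTES = 10        # TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
CLASSIC_UDP_LIMIT = 512    # maximum UDP payload without EDNS


def _labels(name):
    """Split a domain name into lowercase labels, ignoring a trailing dot."""
    return tuple(label for label in name.lower().rstrip(".").split(".") if label)


def _name_bytes(name, seen):
    """Bytes needed for one name, reusing any suffix already in the message."""
    labels = _labels(name)
    size = 0
    for i in range(len(labels)):
        suffix = labels[i:]
        if suffix in seen:
            return size + 2            # 2-byte compression pointer ends the name
        seen.add(suffix)
        size += 1 + len(labels[i])     # length octet + label text
    return size + 1                    # zero-length root label


def referral_size(qname, zone, nameservers):
    """Estimate referral size; nameservers maps NS name -> list of glue IPs."""
    seen = set()
    size = HEADER_BYTES + _name_bytes(qname, seen) + QUESTION_FIXED_BYTES
    for ns_name in nameservers:        # authority section: one NS RR per server
        size += _name_bytes(zone, seen) + RR_FIXED_BYTES + _name_bytes(ns_name, seen)
    for ns_name, addresses in nameservers.items():   # additional section: glue
        for addr in addresses:
            addr_bytes = 16 if ":" in addr else 4    # AAAA vs. A RDATA
            size += _name_bytes(ns_name, seen) + RR_FIXED_BYTES + addr_bytes
    return size


if __name__ == "__main__":
    servers = {
        "ns1.example.com": ["192.0.2.1"],
        "ns2.example.com": ["198.51.100.1"],
    }
    size = referral_size("www.example.com", "example.com", servers)
    print(size, size <= CLASSIC_UDP_LIMIT)   # 101 True
```

In-zone nameserver names compress well because each one shares the zone
suffix, so the per-server cost is mostly the fixed RR overhead plus glue.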

                           - Kevin