Disaster Recovery Bind architecture
kcd at chrysler.com
Thu May 22 00:41:24 UTC 2008
> I've searched, but not found anything on this specific topic. I am about
> to implement two disaster recovery site nameservers; one internal, one
> external. I want to keep the internal entries strictly internal. The
> external will serve nat'd addresses of the internal nameserver as well
> as function as a slave to the primary site nameserver.
> I would have no problem implementing this model if the domain names at
> the DR site were different from the primary site. My issue is that
> because I am using the same domain name, I have had to create a custom
> scripted solution to allow the loading of split domain resource records
> (using $INCLUDE directives, and sed/awk to remove SOA and header
> information from the imported data files.) The fact that this does work
> does not negate the issue that I find the solution cumbersome. The issue
> would be much more simple if I change the DR site to a different domain
> name, since the resource record SOA would be different.
I'm confused: why do you need to do this "munging" of the zonefile? As
far as I can understand it, the only difference between the original
version of the zone and the "munged" version would be the SOA record and
the apex NS records (that's what you mean by "header information"
right?). But nothing really cares about the SOA record (except Dynamic
Update clients and, in a multi-level slaving hierarchy, mid-level
slaves, who use the MNAME field of the SOA record in determining who
gets NOTIFYs), and if you put the "primary" NS(es) and the DR NS(es) at
the apex of the zone, Internet resolvers will quickly find and use the
DR nameservers if the primary one(s) is/are down. So there's no real
reason for the "header" of the zone to be different on different
nameservers, and no "munging" should be required.
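To illustrate, the very same zone file can simply list all of the nameservers, primary-site and DR, at the apex; every server then loads an identical copy (all names, addresses and timer values below are invented for the example):

```
$TTL 3600
example.com.    IN SOA  ns1.example.com. hostmaster.example.com. (
                        2008052201 ; serial
                        7200       ; refresh
                        900        ; retry
                        1209600    ; expire
                        3600 )     ; negative-caching TTL
; primary-site nameservers
example.com.    IN NS   ns1.example.com.
example.com.    IN NS   ns2.example.com.
; DR-site nameservers -- same zone file, no "munging" required
example.com.    IN NS   ns-dr1.example.com.
example.com.    IN NS   ns-dr2.example.com.
```

Resolvers that learn this NS RRset will retry against the DR servers on their own if the primary-site servers stop answering.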
Secondly, I don't know what you're getting at with "The external will
serve nat'd addresses of the internal nameserver". NAT or no NAT, why
would you want Internet resolvers querying your internal nameserver?
That seems like a bad security practice to me. A lot of DNS-based
exploits have been identified over the years, so I'd rather only expose
nameservers that are on the "edges" of my network.
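As a sketch of what I mean (hypothetical named.conf fragment, addresses made up): rather than NAT-ing the internal server out to the world, keep it answering internal clients only, and let a separate edge server face the Internet:

```
// on the internal nameserver -- answers internal clients only
acl internal { 10.0.0.0/8; 192.168.0.0/16; };
options {
    allow-query    { internal; };
    allow-transfer { internal; };
    recursion yes;
};
```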
> Now that I have stated my issue, my real questions are:
> 1) How do most businesses address this issue?
A variety of different ways, I'd imagine. In our case we have two main
production datacenters that have (diverse) connectivity to the Internet
and for most apps (e.g. web stuff) we use "global", DNS-based load
balancing to allow the servers to run in both datacenters with the
failover being automatic if the server(s) in one datacenter are down,
e.g. in the worst case, the whole datacenter is down. For DNS itself,
since it can't really be load-balanced using DNS (slight chicken-and-egg
problem there), we have one VIP (virtual IP) for each set of DNS servers
at each datacenter, i.e. "local" load-balancing. So Internet DNS
resolvers will only see two VIPs associated with the nameservers for our
external zones, but there are multiple machines "behind" each VIP so
that we have transparent fault-tolerance within any given datacenter,
and if one datacenter should go down completely, we still have
functioning nameservers in the other datacenter.
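In zone-file terms the published picture is as simple as two NS records, one VIP per datacenter (addresses invented for illustration):

```
example.com.        IN NS  ns-dc1.example.com.
example.com.        IN NS  ns-dc2.example.com.
; each name resolves to a load-balancer VIP, with several
; real nameserver machines behind it in that datacenter
ns-dc1.example.com. IN A   192.0.2.53
ns-dc2.example.com. IN A   198.51.100.53
```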
> 2) Is it normal to have a DR DNS function as both a slave to the primary
> site and a primary to different DR resources?
I doubt it. Mixing up master and slave roles on various Internet-facing
nameservers seems to me to be unmanageable and arguably insecure. Much
simpler for them all to be slaves. (Note that I'm using the term
"slaves" loosely here; if one wants to use another replication method
besides AXFR/IXFR, then that's fine, and I'd still call the replicas
"slaves" in the loose sense).
In our case, we centralize all of our external DNS maintenance on an
internal server (with another internal server as backup), and then all
of the Internet-facing nameservers are simply slaves for that data.
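That "hidden master" arrangement looks roughly like this in named.conf (names and addresses are hypothetical):

```
// on the internal "hidden master" (not listed in the NS RRset)
zone "example.com" {
    type master;
    file "master/example.com.zone";
    allow-transfer { 203.0.113.1; 203.0.113.2; }; // the edge slaves
    also-notify    { 203.0.113.1; 203.0.113.2; };
};

// on each Internet-facing nameserver
zone "example.com" {
    type slave;
    file "slave/example.com.zone";
    masters { 172.16.0.53; }; // the internal master, reachable from the edge
};
```

The Internet only ever sees the slaves; all maintenance happens on the internal master.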
> 3) Is it acceptable to have all three nameservers (primary site, DR
> primary, DR secondary) all serve the same domain name?
Hmmm... why not? The more authoritative nameservers that are published
for the zone, the more the query load is spread out, and the less impact
there will be if any given one of them fails or becomes unavailable.
Some registries have limits on how many nameservers they'll allow in a
delegation, but even if you just have a subset of your authoritative
nameservers in the delegation records, as long as they are all in the
apex NS records they'll get used (assuming that a sufficient number of
the resolvers cache NS records according to ranking rules in RFC 2181,
which ranks in-zone data above referral data). Don't go overboard with
NS records, though; you don't want to have so many that you force older
DNS resolvers into TCP retries. Try to keep the referral responses (NS
RRset + glue) within 512 bytes, taking into account label compression.
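One quick way to check that size is to query a parent-zone server directly, without EDNS, and look at the message size dig reports (server and zone names here are just examples):

```
# ask a parent (TLD) server for the delegation, plain DNS (no EDNS),
# then check the ";; MSG SIZE rcvd" line at the bottom of the output
dig +norecurse +noedns @a.gtld-servers.net example.com NS
```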