DNS Round Robin

Kevin Darcy kcd at daimlerchrysler.com
Mon Apr 10 20:51:37 UTC 2000


Sounds like you're trying to set up a primary-backup kind of architecture, and
trying to use DNS to make it all work. Here's a response I gave recently to
someone trying to do the same thing:

---

There are 2 basic DNS-based approaches to this problem, each with serious
drawbacks:

1) Define the name with multiple A records, and set a "fixed" rrset-order
on the master and slaves.
DRAWBACKS: A) Unless you can configure this on all of the slaves, and all
servers which may potentially cache the name -- if the name is an Internet
name, then forget it -- you're going to get a certain amount of
"leakage" to your backup server, since caching servers will usually
round-robin answers from cache. You can minimize the effect of the caching
servers by lowering the TTL values on the records, but only at the cost of
increasing DNS traffic. B) Each client needs to be smart enough to
fail over to the second IP in the list it gets from the nameserver. Not all
clients -- especially older clients -- are this smart.
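
Drawback B comes down to whether the client walks the address list. Here is a
minimal Python sketch of that failover logic. The helper name
`connect_with_failover`, the port, and the 192.0.2.x placeholder addresses are
assumptions for illustration, not anything from the original setup:

```python
import socket


def connect_with_failover(addresses, port, timeout=3.0,
                          connect=socket.create_connection):
    """Try each address in the order the nameserver returned it.

    Returns (address, connection) for the first address that accepts the
    connection. `connect` is injectable so the logic can be exercised
    without a network; it defaults to socket.create_connection.
    """
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout=timeout)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
    raise ConnectionError(
        "no address in the list accepted a connection") from last_error
```

With a real resolver the list would typically come from something like
`socket.gethostbyname_ex(name)[2]`. A client that only ever tries the first
entry gets none of this benefit.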

2) Define the name with a single A record and then change it -- using
Dynamic Update or some other mechanism -- when that host fails. DRAWBACK:
as with option #1, caching is going to get in your way here, not to
mention the fact that the slaves may take a while to get the change, even
if they are NOTIFY-aware. Again, you can minimize the effect of caching by
lowering TTL values and putting up with the increased traffic. But with
option #1, round-robin'ing caching servers will at least give out a
working address first in the list 50% of the time (given two addresses),
even during an outage. With option #2, when the "primary" is down, clients
will get the non-working address 100% of the time until their local
caching server times out the cache entry and fetches the changed A record.
Depending on the protocol and the client software, a 50% connection
failure rate may still allow the users to continue working -- although
probably with degraded performance -- and may therefore be preferable to a
temporary 100% failure rate.
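
The 50% versus 100% comparison can be checked with a little arithmetic. This is
a rough Python sketch under simplifying assumptions: two A records, one dead
host, and a caching server that rotates the cached RRset by one position per
lookup. The function name and the 192.0.2.x addresses are placeholders:

```python
def first_answer_failure_rate(cached_addresses, dead, lookups=1000):
    """Fraction of lookups whose FIRST address is dead.

    Models a caching server that rotates (round-robins) the cached RRset by
    one position per lookup, which is an assumed, simplified behaviour.
    """
    failures = 0
    rrset = list(cached_addresses)
    for _ in range(lookups):
        if rrset[0] in dead:
            failures += 1
        rrset = rrset[1:] + rrset[:1]  # rotate for the next client
    return failures / lookups


dead_hosts = {"192.0.2.95"}  # the failed "primary" (placeholder address)

# Option 1: both A records are cached and handed out round-robin.
option1 = first_answer_failure_rate(["192.0.2.95", "192.0.2.109"], dead_hosts)

# Option 2: the cache still holds only the old single A record until its
# TTL expires.
option2 = first_answer_failure_rate(["192.0.2.95"], dead_hosts)

print(option1, option2)  # 0.5 1.0
```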

Of course, there are non-DNS-based approaches to this problem also,
usually involving router or router-like hardware or software. In this
case, you typically have an invariant IP address which is presented to the
rest of the world, and then the packets are re-routed "behind" that
IP address in case of failure. Most of these products can also do real
Dynamic Load Balancing. They're generally pretty expensive, though.

Last but not least, SRV records, which provide a "service
location" mechanism, also have "priority" and "weight" fields, that in
theory allow one to implement load-balancing and/or redundancy without all
of the caching complications. Unfortunately, the client software needs to
be SRV-aware in order for this to work, and to date there aren't any
SRV-aware clients for popular protocols like HTTP and FTP. In fact,
I think the only SRV-aware client is the Win2000 client, and that it only
uses SRVs for Active Directory-related stuff.
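
To make the priority/weight idea concrete, here is a simplified Python sketch
of how an SRV-aware client might order its targets, loosely following the
selection rule described in RFC 2782. The `SrvRecord` type, the function name,
and the example.com hostnames are made up for illustration:

```python
import random
from collections import namedtuple

SrvRecord = namedtuple("SrvRecord", ["priority", "weight", "port", "target"])


def order_srv_targets(records, rng=random):
    """Return the records in the order an SRV-aware client would try them.

    Lower priority values are tried first; within one priority, records are
    drawn at random in proportion to their weight.
    """
    ordered = []
    for priority in sorted({r.priority for r in records}):
        # Zero-weight records go first so they keep a small chance of
        # being selected.
        group = sorted((r for r in records if r.priority == priority),
                       key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)
            running = 0
            for record in group:
                running += record.weight
                if running >= pick:
                    ordered.append(record)
                    group.remove(record)
                    break
    return ordered


records = [
    SrvRecord(priority=10, weight=0, port=80, target="backup.example.com."),
    SrvRecord(priority=0, weight=0, port=80, target="primary.example.com."),
]
print([r.target for r in order_srv_targets(records)])
# ['primary.example.com.', 'backup.example.com.']
```

For a primary-backup setup, distinct priorities are all that is needed: the
backup is only tried once the primary fails, and since the client does the
ordering itself, round-robin'ing caches don't disturb it.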

In my opinion, there really ought to be a record type the sole purpose of
which is for servers to communicate to each other how to order RRsets.


- Kevin



Masataka_Tanaka wrote:

> # Prompt reply is very very welcome q(^o^)p
>
> Hello, there.
>
> My current BIND working environment is Solaris 2.6 with BIND 8.x.
> I would like to ask you all a question about DNS round robin.
> When we set up several hosts as Round Robin, do they have a fail-safe
> function or not?
>
> I shall tell you about my basic trials.
>
> Case1 - DNS Setup
>
> 1)  First, I set up two hosts with the same name and different IPs.
>     They are named Robin.domainname as FQDN. (y.y.y.0/24 segment)
>
>     # nslookup
>     > Robin.domainname
>     Name:    Robin.domainname
>     Addresses:  y.y.y.95, y.y.y.109
>
>     > y.y.y.95
>     Name:    Robin.domainname
>     Address:  x.x.x.95
>
>     > y.y.y.109
>     Name:    Robin.domainname
>     Address:  y.y.y.109
>
> 2)  Send the ICMP packets from some nameserver via 'ping' command.
>      ( Network Layer; OSI layer 3)
>       itcns2# ping mail.uhclan.sony.co.jp
>       ICMP Host redirect from gateway (x.x.x.254) to x.x.x.247 for y.y.y.95
>       Robin.domainname is alive
>
>        On this host, the Name Service Cache Daemon (nscd) is running, and it looks
>        like the cached result is reused for the next DNS lookup (within the TTL).
>
> # Q1.    When the 'nscd' process is stopped on a working DNS server, does it
>            affect the named service or some cache?
>
>        When I stop one of the hosts in Round Robin and type 'ping', I get
>        the result shown below.
>
>        [ Alive Host ]
>
>        nameserver# ping Robin.domainname
>        ICMP Host redirect from gateway (x.x.x.254) to x.x.x.247 for y.y.y.109
>        Robin.domainname is alive
>
>        [ Dead Host ]
>
>        nameserver# ping y.y.y.95
>        ICMP Host redirect from gateway (x.x.x.254) to x.x.x.247 for y.y.y.95
>        no answer from y.y.y.95
>
> At the network layer, it looks like fail-safe works thanks to the Round Robin setup.
>
> Case2 - Shutdown one of hosts during ICMP packets transmit, and check
>             fail safe via DNS Round Robin Configuration
>
> 1. Type the 'ping' command.
>
>     nameserver# ping -s [ RoundRobin_HOST FQDN ]
>
> 2. ICMP packets were transmitted regularly, and one of the hosts replied.
>
>     ICMP Host redirect from gateway (x.x.x.254) to x.x.x.247 for y.y.y.109
>     64 bytes from y.y.y.109: icmp_seq=0. time=3. ms
>     64 bytes from y.y.y.109: icmp_seq=32. time=2. ms
>     . . . . . . . .
>     64 bytes from y.y.y.109: icmp_seq=42. time=2. ms
>
> 3. Shutdown the alive host.
>
>     ICMP Host redirect from gateway (x.x.x.254) to x.x.x.247 for y.y.y.109
>
> 4. Interrupt ICMP transmission because I confirmed host doesn't
>    reply anymore.
>
>    ----mail.uhclan.sony.co.jp PING Statistics----
>    107 packets transmitted, 43 packets received, 59% packet loss
>    round-trip (ms)  min/avg/max = 2/3/12
>
> # Q2.    Is there any way to realize fail-safe (switch-over) via DNS
>            configuration?
>            ( By 'fail-safe' I mean: when one of the Round Robin hosts is
>              dead, DNS lookups avoid returning the dead host's information.)
>
>            Otherwise, is there any SOFTWARE tool for monitoring hosts or
>            ports of DNS hosts?
>            (if possible freeware)
>
> I am looking forward to getting your reply soon.
> Bye for now, and thank you so much for reading to the last.
>
> --
> Thanks & Have a good day !
> Sony Systems Design Corp.
> Customer Service Dept.
> Masataka TANAKA
> mail to :  mtanaka at ssd.sony.co.jp
> tel       :  +81-3-5479-6629 (Tokyo, Japan)





