Can DNS Round Robin be fault tolerant?

Kevin Darcy kcd at daimlerchrysler.com
Tue Jan 10 21:35:51 UTC 2006


RaysOfSearch wrote:

>I am trying to use DNS round robin for load balancing multiple web
>servers. I have read that there are some problems with this approach
>however, especially related to fault tolerance and session maintenance.
>By fault tolerance I mean if a web server is down, the DNS server has
>no way of detecting this and will still send every nth request to the
>dead server. Is there any way of making the DNS server more fault
>tolerant, so that it can detect the dead web servers and not send
>any more requests there?
>
> I found a possible solution to the problem listed on this site here:
>http://www.presttun.org/kare/DNS/DNS-LB-FT.pdf
>
>Does anyone know about this paper and if so, are there any issues with
>the solution listed here?
>
>We would like to allow session maintenance. Does the DNS round robin
>allow this and what are the possible ways to achieve this? Also, the
>web servers are in different geographical locations. Are there any
>problems with using the DNS round robin approach to load balance the
>servers in different locations? 
>
Load-balancing of web servers or other application-level servers within 
the *same* location -- specifically, the same subnet -- is something 
that can be done without any DNS involvement, so I'm going to 
consider that off-topic here.

As for the "solution" you cited, let me say that just because someone 
writes something up and publishes it as a PDF doesn't necessarily mean 
it's a good idea. Running an incoherent DNS zone, using web servers as 
nameservers, isn't a particularly smart idea in my book, especially 
since there are lots of ways in which the web server can become 
unavailable even though the same box may be serving up DNS just fine.

Commercial load-balancing solutions still suffer from some of the 
limitations of DNS-based load balancing, e.g. having to lower TTLs to 
anti-social levels, but at least they employ robust keep-alive 
mechanisms so that they have a reasonable idea whether the backend 
service they are load-balancing is available *at*an*application*level*, 
as opposed to the whole box being down or unavailable. Many if not most 
of them can do the session maintenance thing too.
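
To illustrate what a check "at an application level" means, here is a 
minimal sketch in Python. It is not how any particular commercial product 
works. The names http_probe and live_addresses, the hostname 
www.example.com, and the 192.0.2.x addresses are all made up for the 
example. The point is only that a backend counts as "up" when it actually 
answers an HTTP request, not merely when the box responds to ping or DNS.

```python
import http.client
from typing import Callable, Iterable, List


def http_probe(address: str, host: str = "www.example.com",
               path: str = "/", timeout: float = 2.0) -> bool:
    """Return True only if the web server at `address` answers over HTTP.

    Hypothetical helper: the hostname, path, and timeout are assumptions.
    A 2xx or 3xx status counts as healthy; a refused connection, a
    timeout, a protocol error, or a 4xx/5xx status counts as down.
    """
    conn = http.client.HTTPConnection(address, 80, timeout=timeout)
    try:
        # Send the Host header the real site would receive, so that a
        # name-based virtual host is the thing being tested.
        conn.request("HEAD", path, headers={"Host": host})
        status = conn.getresponse().status
        return 200 <= status < 400
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def live_addresses(addresses: Iterable[str],
                   probe: Callable[[str], bool] = http_probe) -> List[str]:
    """Keep only the addresses whose application-level probe succeeds."""
    return [addr for addr in addresses if probe(addr)]
```

Something like live_addresses(["192.0.2.10", "192.0.2.11"]) would be run 
periodically, and only the surviving addresses handed out. Note that this 
does nothing about answers already cached by resolvers: a dead address 
keeps being used until its TTL expires, which is why the TTLs end up so 
low.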

                                                                         
                                             - Kevin



