Round-robin for high availability?

Dave Henderson dhenderson at digital-pipe.com
Sun Jul 16 18:51:00 UTC 2006


Kevin Darcy <kcd at daimlerchrysler.com> wrote:
cdevidal wrote:
> ==== My real address is Chris (AT) deVidal (DOT) tv ====
>
> I've been experimenting with multiple A records for both
> load-distributing AND high availability.
>
> Up until this point I was always told that round-robin is for
> load-distributing ONLY and should not be used for high availability
> failover.  But in practice this is not proving to be true.  I'm
> beginning to think that was just FUD.
>
> Do a lookup on roundrobintest8.strangled.net and
> roundrobintest9.strangled.net.  Notice the A records:
> roundrobintest8.strangled.net. 3600 IN A 127.0.0.1
> roundrobintest8.strangled.net. 3600 IN A 63.95.68.129  # Real server
>
> roundrobintest9.strangled.net. 3600 IN A 10.69.96.69   # Bogus IP
> roundrobintest9.strangled.net. 3600 IN A 63.95.68.129  # Real server
>
> Now, disable anything running on localhost:443 and make sure you do
> *not* have a host at 10.69.96.69.
>
> Browse https://roundrobintest8.strangled.net/ and
> https://roundrobintest9.strangled.net/  You should never get a DNS
> error.  It should always give you first an SSL warning (hostname
> mismatch) and login prompt.  Oh it'll pause while it tries the bad IP
> but after about 5 seconds it flips to the real server.
>
> Now load up an SSL web server on localhost.  I used Apache+mod_ssl on
> Linux and TinySSL on Windows.  Set up an index page with links to
> several other pages.
>
> (Sorry to require SSL, it was the only web server I have control over
> that no one is using at the moment, so I can kill the web service any
> time I want... You could also load up an FTP or SSH server on localhost
> instead of SSL.  My server has all three.)
>
> Flush your cache (e.g. ipconfig /flushdns) and reload the website.
> Sometimes you will get localhost, sometimes my server.  That's the
> load-distributing action we all know and love.
>
> If you don't get localhost, keep flushing your cache until you get it.
> Then kill your server and click on a link in the web page that is still
> up on your screen.  It will fail back to my server and generate a 404.
> That's high availability!  Even though it generates an error, it's
> coming from my server nonetheless!
>
> -No- client I've tried (browser, FTP client, MySQL, SSH etc.) fails on
> the bad IP (10.69.96.69).  It thinks for a few seconds and then tries
> the good IP.
>
> Nor does it fail when the IP is good, as in the case of localhost, but
> no service is listening on that port.
>
>
> I've tried this on:
> Windows 95
> Windows 98
> Windows 2000
> Windows XP
> Ubuntu 6.06
> Debian 3.1
> CentOS 3
> CentOS 4
>
> With these clients:
> Netscape 4.5 (Nice and old!!!)
> IE 5.5
> IE 6
> Firefox 1.0
> Firefox 1.5
> DOS FTP
> Linux FTP
> Linux NcFTP
> MySQL client
> OpenSSH client
>
>
> My idea is to set up a live server running web/mail/DNS/DB/FTP and a
> warm standby, such as:
> www.example.com. 3600 IN A 1.1.1.1
> www.example.com. 3600 IN A 2.2.2.2
>
> The warm standby is powered on but no services are started.  Live is
> synchronized to warm standby.  If the live fails I bring up the
> standby.  Bing bang boom, the client automatically goes to the standby.
>
> It'll be just web/POP/SSH/FTP because DNS and SMTP already have
> built-in load-distributing and high availability capabilities.  No
> database ports will be exposed to the outside world but if I do they
> should work.
>
>
> If this works, so cool!  Replacement for expen$ive and complicated HA
> solutions :-)
>
> Was clued into this by Mr. Tenereillo:
> http://www.tenereillo.com/GSLBPageOfShame.htm
>
>
> What am I missing?  Do I need to do more testing?
>
> Am I crazy?  Or crazy like a fox?  ;-)
>
> Someone check me on this because I'm not sure I'm testing it right...
>
>   
A 5-second delay on half of the accesses is not acceptable to most folks 
in the market for a "high-availability" solution.
                                                                         
                     - Kevin
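
A rough way to put a number on the delay being discussed, as a sketch only. It assumes the resolver hands back the A records in a uniformly random order, that the client tries them in that order, and that each dead address costs a fixed timeout (the "about 5 seconds" reported above). The function name and parameters are made up for illustration.

```python
def expected_extra_delay(total_records, dead_records, timeout_seconds):
    """Average delay before a client reaches a live address.

    Assumes a uniformly random record order and a fixed timeout per
    dead address; real resolvers and clients may behave differently.
    """
    live_records = total_records - dead_records
    if live_records <= 0:
        raise ValueError("at least one address must be reachable")
    # Expected number of dead addresses tried before the first live one
    # in a random ordering is dead / (live + 1).
    expected_dead_tried = dead_records / (live_records + 1)
    return expected_dead_tried * timeout_seconds


# Two A records, one of them dead, 5-second timeout.
print(expected_extra_delay(2, 1, 5.0))  # prints 2.5
```

With two records and one dead, half of the fresh connections wait the full 5 seconds, so the average works out to 2.5 seconds per connection under these assumptions.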



Wouldn't high-availability mean that you get what you're looking for 99% of the time, rather than getting it less often (5 seconds of time loss or not)? In a situation like the one the OP has described, the user would get their answer 100% of the time. Shouldn't that count as high-availability?
  
  Dave
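
For readers who want to see the client-side behavior the OP is relying on, here is a minimal sketch of it in Python. It is not taken from any of the clients listed above. It only illustrates the general pattern of walking the list of A records and moving on when one address times out or refuses the connection. The function names are hypothetical, and the 5-second default simply mirrors the pause the OP reported.

```python
import socket


def resolve_a_records(host, port):
    """Return every IPv4 (ip, port) pair the resolver gives for host."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return [info[4] for info in infos]


def connect_first_reachable(addresses, timeout=5.0):
    """Try each (ip, port) in order; return the first connected socket."""
    last_error = None
    for ip, port in addresses:
        try:
            # A dead IP costs up to `timeout` seconds; a live IP with no
            # listener is usually refused almost immediately.
            return socket.create_connection((ip, port), timeout=timeout)
        except OSError as exc:
            last_error = exc
    raise ConnectionError("no address accepted a connection") from last_error
```

Usage would look like connect_first_reachable(resolve_a_records("www.example.com", 443)). Whether a given browser or FTP client does the same thing is up to that client, which is why the OP's per-client testing matters.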



