Managing an Internet outage

Damien Hull dhull at
Mon May 12 03:37:34 UTC 2008

How did you redirect the traffic? Did you use DNS for this?

----- Original Message -----
From: "Dawn Connelly" <dawn.connelly at>
To: "Damien Hull" <dhull at>
Cc: bind-users-bounce at, comp-protocols-dns-bind at, "Mike Diggins" <diggins at>
Sent: Sunday, May 11, 2008 6:09:45 PM GMT -09:00 Alaska
Subject: Re: Managing an Internet outage

My guess is that he is trying to prevent calls to his helpdesk saying "Hey, I can't get to XYZ. Did you know it's down? When is it going to be back up?" If the customer gets an "Oops, we are broken. Please check back in X amount of time." page, their helpdesk won't get flooded with calls. I have my resources set up so that there are monitors both on and off the network polling them, and if the polls fail past a specified threshold, customers get redirected to an oops page. It significantly cut down on our BS calls to the helpdesk. It also looks more professional to your customers and gives them that warm fuzzy "proactive" feel. 
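The poll-and-threshold idea can be sketched roughly as below. This is only an illustration, not the setup described above: the FailureThreshold class, the run_polls helper, and the probe callable are hypothetical names, and the sketch assumes the threshold counts consecutive failed polls. A real probe would be an HTTP or ping check run from both inside and outside the network.

```python
from typing import Callable, List


class FailureThreshold:
    """Track consecutive failed polls and report when a threshold is reached."""

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, poll_ok: bool) -> bool:
        """Record one poll result; return True while the oops page should be served."""
        if poll_ok:
            self.consecutive_failures = 0  # any success resets the count
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


def run_polls(probe: Callable[[], bool], tracker: FailureThreshold, count: int) -> List[bool]:
    """Call probe() count times and return the redirect decision after each poll."""
    return [tracker.record(probe()) for _ in range(count)]
```

Requiring several failures in a row keeps a single dropped poll from flipping customers to the oops page.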

On Sun, May 11, 2008 at 4:56 PM, Damien Hull < dhull at > wrote: 

This might be a dumb question but won't people know you are down when they can't access the website? 

I think messing with DNS every time the net connection goes down is asking for trouble. If you make a mistake you could be down for hours. Think of all the email you would lose. 

On the other hand, you would only need to change the A record for the website, assuming you have access to a master DNS server. 

The real trouble is that the TTL would have to be set to something like 30 minutes or less. Otherwise things won't switch over in time. You might also have people still going to the backup site after your real website is back online. 
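To put rough numbers on the TTL concern, here is a small sketch. It assumes resolvers honour the TTL, and the function name and the 120-second detection delay are made up for illustration. A resolver that cached the old A record just before the change can keep serving it for a full TTL, so the worst case is the detection delay plus the TTL. The same bound applies when switching back to the real site.

```python
def worst_case_switchover(ttl_seconds: int, detection_seconds: int = 0) -> int:
    """Upper bound, in seconds, on how long a TTL-honouring resolver may
    keep handing out the old A record after the outage starts (or ends)."""
    return detection_seconds + ttl_seconds


# Compare the TTL values mentioned in this thread, assuming a hypothetical
# 120-second delay before the record is actually changed.
for label, ttl in [("1 hour", 3600), ("30 minutes", 1800), ("10 minutes", 600)]:
    minutes = worst_case_switchover(ttl, detection_seconds=120) // 60
    print(f"TTL {label}: up to {minutes} minutes")
```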

Again, I think messing with DNS is not a good idea. 

----- Original Message ----- 
From: "Damien Hull" < dhull at > 
To: "Dawn Connelly" < dawn.connelly at > 
Cc: comp-protocols-dns-bind at , "Mike Diggins" < diggins at > 
Sent: Sunday, May 11, 2008 2:28:06 PM GMT -09:00 Alaska 
Subject: Re: Managing an Internet outage 

Why not go with a master and slave DNS configuration? This is the way DNS should work. 

1. The master DNS server is the one that updates all the other DNS servers 
2. The slave gets its info from the master 
3. Any changes on the master get pushed or pulled to the slave DNS servers 
4. Place your slave DNS servers off site 

If your internet connection goes down you don't need to do anything to the slave DNS server. It's got the correct info. 
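A master/slave pair along these lines might look like the following named.conf fragments. This is a sketch only: the zone name example.com, the file paths, and the addresses (taken from the documentation ranges 198.51.100.0/24 and 192.0.2.0/24) are placeholders, not values from this thread.

```
// On the master, assumed here to be 198.51.100.53
zone "example.com" {
    type master;
    file "example.com.zone";
    allow-transfer { 192.0.2.53; };   // let the off-site slave pull the zone
    also-notify { 192.0.2.53; };      // tell it when the zone changes
};

// On the off-site slave, assumed here to be 192.0.2.53
zone "example.com" {
    type slave;
    masters { 198.51.100.53; };
    file "slaves/example.com.zone";
};
```

The slave keeps answering from its copy of the zone until the SOA expire timer runs out, so that value should comfortably exceed the longest outage you expect.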

I'm assuming you have a backup email server off site as well. Assuming your MX records are correct the backup email server will start receiving email. 

Your website won't be available while your internet connection is down, but I don't see that as a big deal unless you are providing something critical. 

When your internet connection comes back you will start receiving email and your website will be available. Any email that was on the backup email server will be delivered to your main email server. 
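The backup mail arrangement described above comes down to two MX records with different preference values. A sketch of the zone-file lines, assuming the zone origin is example.com and using made-up host names and documentation addresses:

```
@            3600  IN  MX  10  mail.example.com.         ; main server, tried first
@            3600  IN  MX  20  backup-mail.example.com.  ; off-site backup
mail         3600  IN  A   198.51.100.25
backup-mail  3600  IN  A   192.0.2.25
```

Sending servers try the lowest preference number first and fall back to the next one when it is unreachable. The backup server also has to be configured to accept mail for the domain and relay it to the main server once it is reachable again.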

This is the way things should be configured. 

----- Original Message ----- 
From: "Dawn Connelly" < dawn.connelly at > 
To: "Mike Diggins" < diggins at > 
Cc: comp-protocols-dns-bind at 
Sent: Sunday, May 11, 2008 12:46:52 PM GMT -09:00 Alaska 
Subject: Re: Managing an Internet outage 

Best practice is to always make sure that your authoritative DNS servers are 
on physically different networks, so your boss is right in thinking this 
needs to happen. A couple of things to consider. If your master DNS server is 
down, you'll need to reconfigure the offsite machine to be primary so you 
can change the DNS resolution. Not a big deal, but make sure to include that 
step in your DR plan. You have control over your TTLs. You can drop them to 
10 minutes (or whatever your SLAs dictate) in the event of a network outage 
so you can recover faster, without carrying the increased query load all the 
time. Mail will queue for a while on the email servers that are trying to 
send it if they can't contact your mail server, so that buys you some time 
too. You might want to leave your MX pointing at the correct mail server 
even in your failure state, to make sure that mail queues on the remote end 
and is sent as soon as the network is back up. From a best practice 
standpoint, though, it would be better to have an email server at your DR 
site listed as a second MX with a higher preference number (lower priority). 
Also, some ISPs tend to cache just one authoritative DNS server and keep 
trying to hit it even if it's down. The only thing you can do to fix that is 
ask the ISP to clear their cache. Road Runner has burned me with that 
multiple times. 
So your DR plan would look something like this: 
Network outage is detected. 
Stand-by named.conf file is swapped in on the offsite machine to reference 
the outage zone files and configure the machine as master. 
Outage zone files include the following records: 
@ 600 IN A <IP address of "We are broken" webserver> 
@ 3600 IN MX 10 <host name of email server> 
<host name of email server> 3600 IN A <IP address of email server> 
* 600 IN A <IP address of "We are broken" webserver> 

Once the failure has been cleared, the stand-by named.conf is swapped back 
with the original file and named is restarted. 
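If the outage zone file is generated rather than kept by hand, the records above could be produced with something like the sketch below. The function name and all example values are hypothetical. The MX target is written as a host name because an MX record points at a name rather than an address, and the sketch assumes the mail host lives inside the zone, so it gets its own A record. Without one, the wildcard would answer for the mail host with the "We are broken" address. SOA and NS records are left out.

```python
import ipaddress


def outage_records(web_ip: str, mail_host: str, mail_ip: str,
                   web_ttl: int = 600, mx_ttl: int = 3600) -> str:
    """Return the outage-mode records as zone-file text (no SOA or NS records)."""
    # Raises ValueError if either address is malformed.
    ipaddress.IPv4Address(web_ip)
    ipaddress.IPv4Address(mail_ip)
    if not mail_host.endswith("."):
        mail_host += "."  # fully qualify so the zone origin is not appended
    lines = [
        f"@ {web_ttl} IN A {web_ip}",
        f"@ {mx_ttl} IN MX 10 {mail_host}",
        f"{mail_host} {mx_ttl} IN A {mail_ip}",
        f"* {web_ttl} IN A {web_ip}",
    ]
    return "\n".join(lines) + "\n"
```

For example, outage_records("192.0.2.80", "mail.example.com", "192.0.2.25") returns the four record lines with the short TTL on the web records and the longer one on the mail records.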

You can script this to happen automatically if you have a monitoring system 
in place with some Perl scripts, or you can do it manually. You can also 
look at products that do all of this for you automagically. The Global 
Traffic Manager by F5 (Big-IP GTM) is the one I'm most familiar with, but I'm 
sure others on this list could give other examples too. The GTM box will 
continually test access to your resources, and as soon as they become 
unavailable it will hand out whatever information you have configured as 
your fallback IP address. 
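For the scripted route, the config swap itself could look roughly like this. It is a sketch in Python rather than Perl, and the function names, file locations, and the reload_named callback are all hypothetical. How named gets restarted depends on the system, so that step is left to the caller.

```python
import shutil
from pathlib import Path
from typing import Callable


def activate_config(source: Path, live: Path, reload_named: Callable[[], None]) -> None:
    """Copy source over the live named.conf, then ask named to pick it up."""
    tmp = live.with_name(live.name + ".tmp")
    shutil.copyfile(source, tmp)  # write beside the live file first
    tmp.replace(live)             # rename into place (atomic on one filesystem)
    reload_named()


def fail_over(standby: Path, live: Path, saved: Path,
              reload_named: Callable[[], None]) -> None:
    """Enter the outage state: save the normal config, install the stand-by one."""
    shutil.copyfile(live, saved)
    activate_config(standby, live, reload_named)


def fail_back(saved: Path, live: Path, reload_named: Callable[[], None]) -> None:
    """Leave the outage state: reinstall the saved normal config."""
    activate_config(saved, live, reload_named)
```

A monitoring script would call fail_over once the outage is detected and fail_back once the failure has been cleared.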

On Sun, May 11, 2008 at 12:31 PM, Mike Diggins < diggins at > wrote: 


> We occasionally have a situation where our Internet access is completely 
> down. My Manager has asked about the viability of locating a DNS server 
> off site, and during a situation when we're down, modifying it so that it 
> resolves my entire domain to a single IP address. Web users would be 
> redirected to that address, and a web page would explain we're off line. 

> Our DNS TTL is set to 1 hour, however, I'm concerned that sites might 
> cache that address for longer than the TTL, and affect things such as mail 
> delivery beyond the outage. Does anyone have an opinion on this plan? 
> Obviously improving our redundancy is a better solution, and that will 
> come in time. Right now this seems like a quick and easy (dirty) solution. 
> -Mike 
