Managing an Internet outage

Dawn Connelly dawn.connelly at gmail.com
Mon May 12 03:38:17 UTC 2008


Yup.
On Sun, May 11, 2008 at 8:37 PM, Damien Hull <dhull at digital-overload.net>
wrote:

> How did you redirect the traffic? Did you use DNS for this?
>
> ----- Original Message -----
> From: "Dawn Connelly" <dawn.connelly at gmail.com>
> To: "Damien Hull" <dhull at digital-overload.net>
> Cc: bind-users-bounce at isc.org, comp-protocols-dns-bind at isc.org, "Mike
> Diggins" <diggins at mcmaster.ca>
> Sent: Sunday, May 11, 2008 6:09:45 PM GMT -09:00 Alaska
> Subject: Re: Managing an Internet outage
>
> My guess is that he is trying to prevent calls to his helpdesk saying
> "Hey, I can't get to XYZ. Did you know it's down? When is it going to be
> back up?" If the customer gets an "Oops, we are broken. Please check back in
> X amount of time." page, their helpdesk won't get flooded with calls. I have
> my resources set up so that there is a monitor both on and off the network
> polling, and if the poll fails past a specified threshold, customers get
> redirected to an oops page. It significantly cut down on our BS calls to the
> helpdesk. It also makes it look more professional to your customers and gives
> them that warm fuzzy "proactive" feel.
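
The threshold check described above could be sketched roughly as follows. This is only an illustration, not the actual monitoring setup: the names `probe` and `should_redirect`, the plain TCP connect test, and the reading of "threshold" as a run of consecutive failed polls are all assumptions.

```python
import socket
from typing import Iterable


def probe(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Hypothetical poll: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_redirect(results: Iterable[bool], threshold: int) -> bool:
    """Return True once `threshold` consecutive polls have failed.

    `results` is the poll history in order, True meaning the poll succeeded.
    A single successful poll resets the failure count (an assumption).
    """
    consecutive_failures = 0
    for ok in results:
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= threshold:
            return True
    return False
```

Requiring several failures in a row keeps one dropped packet from sending customers to the oops page. The off-network monitor is the one whose results matter for an Internet outage, since the on-network monitor can still reach the servers.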
>
>
>
> On Sun, May 11, 2008 at 4:56 PM, Damien Hull < dhull at digital-overload.net> wrote:
>
>
> This might be a dumb question but won't people know you are down when they
> can't access the website?
>
> I think messing with DNS every time the net connection goes down is asking
> for trouble. If you make a mistake you could be down for hours. Think of all
> the email you would lose.
>
> On the other hand you would only need to change the A record for the
> website. Assuming you have access to a master DNS server.
>
> The real trouble is that the TTL would have to be set to something like 30
> minutes or less. Otherwise you won't get things switched over in time. You
> might also have people going to the backup site after your real website is
> back online.
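
To put a number on that TTL concern, here is a small worked example. It assumes resolvers honour the TTL exactly and that the worst case is a resolver that cached the outage record just before the A record was switched back. The function name and the timeline are made up for illustration.

```python
from datetime import datetime, timedelta


def last_stale_answer(changed_at: datetime, ttl_seconds: int) -> datetime:
    """Latest time a TTL-honouring resolver can still serve the old record.

    Worst case: the resolver cached the old record just before the change.
    """
    return changed_at + timedelta(seconds=ttl_seconds)


# Hypothetical timeline: the real site is back and the A record is restored
# at 12:10, with a 30 minute TTL on the outage record.
restored = datetime(2008, 5, 11, 12, 10)
print(last_stale_answer(restored, 30 * 60))  # 2008-05-11 12:40:00
```

So with a 30 minute TTL, some visitors could still land on the backup page up to 30 minutes after the real site has returned. Resolvers that ignore the TTL can take longer.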
>
> Again, I think messing with DNS is not a good idea.
>
>
> ----- Original Message -----
> From: "Damien Hull" < dhull at digital-overload.net >
> To: "Dawn Connelly" < dawn.connelly at gmail.com >
> Cc: comp-protocols-dns-bind at isc.org , "Mike Diggins" < diggins at mcmaster.ca>
> Sent: Sunday, May 11, 2008 2:28:06 PM GMT -09:00 Alaska
> Subject: Re: Managing an Internet outage
>
>
> Why not go with a master and slave DNS configuration? This is the way DNS
> should work.
>
> 1. The master DNS server is the one that updates all the other DNS servers
> 2. The slave gets its info from the master
> 3. Any changes on the master get pushed or pulled to the slave DNS servers
> 4. Place your slave DNS servers off site
>
> If your internet connection goes down you don't need to do anything to the
> slave DNS server. It's got the correct info.
>
> I'm assuming you have a backup email server off site as well. Assuming
> your MX records are correct the backup email server will start receiving
> email.
>
> Your website won't be available while your internet connection is down, but
> I don't see that as a big deal unless you are providing something critical.
>
> When your internet connection comes back you will start receiving email
> and your website will be available. Any email that was on the backup email
> server will be delivered to your main email server.
>
> This is the way things should be configured.
>
>
>
>
>
> ----- Original Message -----
> From: "Dawn Connelly" < dawn.connelly at gmail.com >
> To: "Mike Diggins" < diggins at mcmaster.ca >
> Cc: comp-protocols-dns-bind at isc.org
> Sent: Sunday, May 11, 2008 12:46:52 PM GMT -09:00 Alaska
> Subject: Re: Managing an Internet outage
>
> Best practice is to always make sure that your authoritative DNS servers are
> on physically different networks, so your boss is right in thinking this
> needs to happen. A couple of things to consider:
>
> If your master DNS server is down, you'll need to reconfigure the offsite
> machine to be primary so you can change the DNS resolution. Not a big deal,
> but make sure to include that step in your DR plan.
>
> You have control over your TTLs. You can drop them to 10 minutes (or
> whatever your SLAs dictate) in the event of a network outage so you can
> recover faster without carrying the increased load all the time.
>
> Mail will queue for a while on the email servers that are trying to send it
> if they can't contact your mail server, so that buys you some time too. You
> might want to leave your MX resolving to the correct machine even in your
> failure state, to make sure that mail queues on the remote end and is sent
> as soon as the network is back up. From a best practice standpoint, though,
> it would be better to have an email server at your DR site with a higher MX
> preference value (that is, a lower priority).
>
> Also, some ISPs tend to cache just one authoritative DNS server and keep
> trying to hit it over and over even if it's down. The only thing you can do
> to fix that is ask the ISP to clear their cache. Road Runner has burned me
> with that multiple times.
>
> So your DR plan would look something like this:
>
> 1. Network outage is detected.
> 2. Stand-by named.conf file is swapped in on the offsite machine to
>    reference the outage zone files and configure the machine as master.
> 3. Outage zone files include the following records:
>
>    @ 600 IN A <IP address of "We are broken" webserver>
>    @ 3600 IN MX 10 <hostname of email server>
>    * 600 IN A <IP address of "We are broken" webserver>
>
> 4. Once the failure has been cleared, the stand-by named.conf is swapped
>    back for the original file and named is restarted.
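
As a rough sketch of how the outage records in that plan might be generated by a script, assuming Python rather than any particular tool from the thread: the function name `outage_records`, the example address 192.0.2.10, and the host mail.example.com are all hypothetical. The MX target is written as a hostname here because MX records point at names, not addresses. If that mail host lives inside the same zone it needs its own A record, otherwise the wildcard would answer for it. SOA and NS records are left out, so this is a fragment, not a complete zone file.

```python
import ipaddress


def outage_records(oops_ip: str, mail_host: str,
                   web_ttl: int = 600, mx_ttl: int = 3600) -> str:
    """Render the three outage records from the plan as zone-file lines."""
    # Raises ValueError if oops_ip is not a valid IPv4 address.
    ipaddress.IPv4Address(oops_ip)
    if not mail_host.endswith("."):
        # Fully qualify the name so the zone origin is not appended to it.
        mail_host += "."
    lines = [
        f"@ {web_ttl} IN A {oops_ip}",
        f"@ {mx_ttl} IN MX 10 {mail_host}",
        f"* {web_ttl} IN A {oops_ip}",
    ]
    return "\n".join(lines) + "\n"


print(outage_records("192.0.2.10", "mail.example.com"))
```

Validating the address before writing the file is a cheap guard against loading a broken zone in the middle of an outage.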
>
> You can script this to happen automatically if you have a monitoring system
> in place with some Perl scripts, or you can do it manually. You can also
> look at products that do all of this for you automagically. The Global
> Traffic Manager by F5 (Big-IP GTM) is the one I'm most familiar with, but
> I'm sure others on this list could give other examples too. The GTM box will
> continually test access to your resources, and as soon as they become
> unavailable it will hand out whatever information you have configured as
> your fallback IP address.
>
>
>
>
>
> On Sun, May 11, 2008 at 12:31 PM, Mike Diggins < diggins at mcmaster.ca >
> wrote:
>
> > We occasionally have a situation where our Internet access is completely
> > down. My Manager has asked about the viability of locating a DNS server
> > off site, and during a situation when we're down, modifying it so that it
> > resolves my entire domain to a single IP address. Web users would be
> > redirected to that address, and a web page would explain we're off line.
> >
> > Our DNS TTL is set to 1 hour, however, I'm concerned that sites might
> > cache that address for longer than the TTL, and affect things such as mail
> > delivery beyond the outage. Does anyone have an opinion on this plan?
> > Obviously improving our redundancy is a better solution, and that will
> > come in time. Right now this seems like a quick and easy (dirty) solution.
> >
> > -Mike