Root zone timeout and workarounds?

denon denon at denon.cx
Tue Feb 20 07:05:31 UTC 2001


At 09:00 PM 2/19/2001 -0500, you wrote:

>When you say the "resolvers" are timing out, do you mean caching nameservers
>doing recursive lookups, or do you mean stub resolvers?

Excuse my lack of terminology .. but here's what's happening, hopefully I'm 
answering your question:

Say I have foo.com registered with NSI. I've also registered the hosts ns1,
ns2, ns3, ns4, and ns5.foo.com.

They're listed on foo.com, at NSI, in that order, with ns5 being the
off-site one and ns1-4 being the ones on our network.

When I take ns1-4 down, I pick a random remote nameserver (say,
ns.yahoo.com), one that I know doesn't have the zone cached. Then I try to
resolve SomeRandomArecord.foo.com through it. These lookups are the ones
that time out. It doesn't matter which remote NS I pick, I get similar
results .. occasionally it'll resolve, usually it times out ..

Am I making sense? I hope so ..
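
For illustration only, here is a rough Python sketch of why a lookup can
fail even though one listed nameserver is answering. It assumes the remote
caching server tries the five listed nameservers in a random order, waits a
fixed time on each dead one, and that the client gives up after a fixed
budget. The 4-second per-try timeout, the 5-second client budget, and the
function names are all assumptions made up for this sketch, not BIND's
actual behaviour (real servers use round-trip-time-based selection and
varying retry timers).

```python
import random

def lookup_time(servers_up, per_try_timeout, answer_time=0.1, rng=random):
    """Seconds until an answer arrives when servers are tried in random order.

    servers_up is a list of booleans, one per listed nameserver.
    Returns float("inf") if no server answers at all.
    """
    order = list(servers_up)
    rng.shuffle(order)
    elapsed = 0.0
    for up in order:
        if up:
            return elapsed + answer_time
        # A dead server costs one full timeout before the next one is tried.
        elapsed += per_try_timeout
    return float("inf")

def success_rate(servers_up, per_try_timeout, client_budget,
                 trials=10000, seed=1):
    """Fraction of simulated lookups that finish within the client's budget."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        if lookup_time(servers_up, per_try_timeout, rng=rng) <= client_budget:
            ok += 1
    return ok / trials

if __name__ == "__main__":
    # ns1-4 down, ns5 up (hypothetical timings, see the note above).
    servers = [False, False, False, False, True]
    print(success_rate(servers, per_try_timeout=4.0, client_budget=5.0))
```

Under these made-up numbers the lookup only succeeds when the live server
is tried first or second, so the rate comes out at roughly 2 in 5. The
point is only that the more dead servers are listed ahead of the live one,
the more likely the client's own timeout expires first.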


>Perhaps you should consider putting
>the remote server second or third in the list to reduce the possibility of
>timeout.

You're probably right. I guess I was under the impression that the root
servers picked the nameservers at random (random, weighted by uptime or
past success, I guess).


>In some versions of BIND 8 there was a "rotate" resolver option which
>would cause the stub resolver to rotate the nameserver list for each
>query. But that option appears to be gone as of BIND 9, so I wouldn't
>rely on it.

Is this an issue with the root servers? Surely they're not running generic 
bind8 .. :)

Thanks for your ideas Kevin. I hope I've clarified things a little.


>denon wrote:
>
> > I've been digging through the archives, usenet as well as a variety of
> > other tech docs in search of the answer for my question.  I haven't come up
> > with any results, but if this is a "frequently asked question", please
> > don't be afraid to throw me to a url.
> >
> > Here's the situation we've got: I need a relatively highly redundant
> > dns system (who doesn't? :). On an
> > Internet domain, as a test, I've listed 5 nameservers. One of the
> > nameservers is at a remote location, and the other 4 are at various places
> > within our internal network.  Due to the fact that the internal network is
> > all geographically in the same area, there's a "good chance" all 4 here
> > would go down at the same time. We don't presently have the facilities for
> > more than one off-site, but I think it's safe to rely on just one.
> >
> > The problem is this: When I take down the 4 internal nameservers (when I
> > say take down, I mean ndc stop, not just drop the zone), the 5th nameserver
> > outside responds just fine. However, I think most resolvers are timing out
> > before it does. Shouldn't the root servers respond faster than the resolver
> > times out? While the 4 are down, if you resolve something 10 times in a
> > row, maybe 6 times it'll time out, and 4 times it'll resolve. (assuming you
> > resolve something different from the same zone each time .. not
> > caching/etc.).
> >
> > Is this a common problem? If all 4 of the internal nameservers go down,
> > will the 5th be of any use?
> >
> > I'd appreciate any insight you can give me, TIA.
> >
> > Best Regards.


