Lame server in cache in BIND 9.2.1?

Ben Bridges ben at springnet.net
Fri May 6 18:30:05 UTC 2005


Greetings.

We are running two BIND 9.2.1 name servers on Solaris.  We are having
trouble with a particular domain -- sbj.net.  I know there is a problem
with the domain.  The root servers think that ns1-auth.sprintlink.net and
ns1.corpranet.net are supposed to be the authoritative servers for the
domain, whereas ns1.corpranet.net and ns1.positech.net are apparently
*supposed* to be the authoritative servers, and ns1-auth.sprintlink.net
indicates that it is *not* authoritative for sbj.net.
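
To make the mismatch concrete, here is a minimal sketch that compares the two NS sets described above.  The names are taken from this post; nothing is queried live, and compare_delegation is a hypothetical helper name used only for illustration.

```python
# NS sets exactly as described in the post; no DNS queries are made here.
parent_ns = {"ns1-auth.sprintlink.net.", "ns1.corpranet.net."}  # what the delegation lists
child_ns = {"ns1.corpranet.net.", "ns1.positech.net."}          # what the zone itself lists


def compare_delegation(parent, child):
    """Return which servers both sides agree on and which appear on one side only."""
    parent = {name.lower() for name in parent}  # DNS names are case-insensitive
    child = {name.lower() for name in child}
    return {
        "agreed": sorted(parent & child),
        "parent_only": sorted(parent - child),
        "child_only": sorted(child - parent),
    }


report = compare_delegation(parent_ns, child_ns)
for key, names in report.items():
    print(key, names)
```

Only ns1.corpranet.net appears on both sides, which is why it is the one server present in both cache dumps.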

If I flush the cache (rndc flush) on our servers, they will successfully
resolve the A record for sbj.net.  A dump of the database at that point
shows that our servers have cached ns1.corpranet.net and ns1.positech.net
as nameservers for sbj.net:

----------------------------------------------------------------
; authauthority
sbj.net.                3554    NS      ns1.positech.net.
                        3554    NS      ns1.corpranet.net.
; authanswer
                        3554    A       69.27.136.10
; authanswer
www.sbj.net.            3554    CNAME   sbj.net.
----------------------------------------------------------------


After the NS records for sbj.net time out (1 hour), our servers then
return SERVFAIL for sbj.net.  A network sniff indicates that our servers
are returning SERVFAIL without even trying to query any other nameservers
for sbj.net.  A dump of the database at that point shows that our servers
have cached ns1.corpranet.net and ns1-auth.sprintlink.net as nameservers
for sbj.net:

------------------------------------------------------------------
; glue
sbj.NET.                155685  NS      ns1.corpranet.net.
                        155685  NS      ns1-auth.sprintlink.net.
; glue
sbs2003.NET.            149675  NS      ns1.sbs2003.net.
------------------------------------------------------------------
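
For anyone who wants to compare the two dumps mechanically, the following is a rough sketch of a parser for excerpts like the ones above.  It assumes only the layout visible in this post (a "; ..." comment line labels the records that follow, and an indented line reuses the previous owner name); parse_dump and ns_view are hypothetical helper names, not part of BIND.

```python
AFTER_FLUSH = """\
; authauthority
sbj.net.                3554    NS      ns1.positech.net.
                        3554    NS      ns1.corpranet.net.
; authanswer
                        3554    A       69.27.136.10
; authanswer
www.sbj.net.            3554    CNAME   sbj.net.
"""

AFTER_EXPIRY = """\
; glue
sbj.NET.                155685  NS      ns1.corpranet.net.
                        155685  NS      ns1-auth.sprintlink.net.
"""


def parse_dump(text):
    """Parse a cache dump excerpt into a list of record dicts."""
    records = []
    label = None  # most recent "; ..." comment, e.g. "glue"
    owner = None  # most recent owner name; indented lines reuse it
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.lstrip().startswith(";"):
            label = line.lstrip()[1:].strip()
            continue
        fields = line.split()
        if not line[0].isspace():
            owner = fields.pop(0).lower()  # owner names are case-insensitive
        records.append({
            "owner": owner,
            "ttl": int(fields[0]),
            "type": fields[1],
            "rdata": " ".join(fields[2:]),
            "label": label,
        })
    return records


def ns_view(records, zone):
    """Return the set of (label, nameserver) pairs cached for a zone."""
    return {(r["label"], r["rdata"]) for r in records
            if r["owner"] == zone and r["type"] == "NS"}


print(ns_view(parse_dump(AFTER_FLUSH), "sbj.net."))
print(ns_view(parse_dump(AFTER_EXPIRY), "sbj.net."))
```

Run against the two excerpts, this shows the NS set for sbj.net changing from records labelled "authauthority" to records labelled "glue", with ns1.corpranet.net the only name common to both.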


My questions are:

1) Why do our servers sometimes cache ns1.corpranet.net and
ns1.positech.net as the nameservers for sbj.net, and why do they sometimes
cache ns1.corpranet.net and ns1-auth.sprintlink.net instead?  Why are they
not consistent?

2) *Should* our nameservers be caching ns1-auth.sprintlink.net as a
nameserver for sbj.net, since that server is lame for sbj.net?

3) If the answer to (2) is yes, is there any way to configure our servers
to keep them from caching lame servers (JUST the lame servers without
affecting caching for anything else)?

4) Why are our nameservers returning SERVFAIL when ns1-auth.sprintlink.net
is in the cache, since ns1.corpranet.net is also in the cache and is
authoritative for sbj.net?  (In other words, why don't our servers go
ahead and try to query ns1.corpranet.net even though
ns1-auth.sprintlink.net is lame for sbj.net?)
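
By "lame" in the questions above I mean a server that is listed as a nameserver for a zone but does not answer authoritatively for it.  A rough, standard-library-only sketch of how that could be checked is below: it sends a non-recursive NS query and looks at the AA bit in the response header (RFC 1035).  The function names are hypothetical, and the sketch skips things a real tool would need, such as matching the response ID, retries and TCP fallback.

```python
import socket
import struct


def build_query(name, qtype=2, query_id=0x1234):
    """Build a non-recursive DNS query packet (qtype 2 = NS, class IN)."""
    # Header: ID, flags (all zero, so RD is clear), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_authoritative(response):
    """Return True if the AA bit (0x0400) is set in a DNS response header."""
    _query_id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & 0x0400)


def looks_lame(server_ip, zone, timeout=3.0):
    """Query server_ip for the zone's NS records over UDP.

    A server listed for the zone that replies without the AA bit
    is treated as lame for that zone.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(zone), (server_ip, 53))
        response, _addr = sock.recvfrom(512)
    return not is_authoritative(response)

# Example call (needs network access; substitute the address of the
# server being tested):
#   looks_lame("<address of ns1-auth.sprintlink.net>", "sbj.net")
```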

I've googled and searched through the bind-users archives but have so far
not found the answers to my questions.

Thanks.

Ben Bridges
Network Engineer
SpringNet / City Utilities of Springfield, MO







