Automatic . NS queries from BIND

Wed Jun 17 08:22:08 UTC 2015

On 16/06/2015 01:51, Kevin Oberman wrote:
> On Mon, Jun 15, 2015 at 1:29 PM, Darcy Kevin (FCA)
> <kevin.darcy at fcagroup.com> wrote:
> 
>     Right, we know how hints files are used, but I think you guys may be
>     missing the underlying conundrum: why is named querying the NS
>     records of the root zone more often than the TTL of that RRset? See
>     that there is a “NS? .” query at 15:36:44 and then another one at
>     15:45:52. At 15:45:52 it should have answered its client from the
>     data it cached from the answer to the 15:36:44 query (less than 10
>     minutes previous).
> 
>     Is named not seeing a response from the root servers in question? Is
>     the max-cache-ttl being capped at a ridiculously-small value?
> 
>     The NS queries of other names besides “.” itself are red herrings.
>     They are all unique names – dot-terminated octet strings, names in
>     the “.mr” TLD, “comp-HP.” -- and we wouldn’t expect them to have
>     been cached previously. But an answer to “NS? .” should be cached
>     for **days**, not just a few minutes.
> 
>     I’m speculating that this might not be a pure “caching DNS server”
>     after all; it might be a forwarder with “forward first” defined. In
>     that case, if the forwarding path experiences occasional delays,
>     then named will fail over to trying iterative resolution, and if the
>     routing and/or firewall rules were never set up to allow that, then
>     the symptoms would be as documented, since named would never get a
>     response from the root servers. General rule: use “forward only” if
>     you must use forwarders **exclusively**; “forward first” is only for
>     **opportunistic** forwarding, where you still have the ability to
>     fall back to iterative resolution, if and when necessary.
>     (Personally, I’m not much of a fan of “forward first”, since it
>     rarely if ever produces the performance benefit expected, or, even
>     if it lowers the *average* query latency, it does so at the expense
>     of the *worst-case* latency -- cache miss plus slow authoritative
>     nameservers and/or misconfigured delegations -- and it’s worst-case
>     that causes apps to time out, to break, and ultimately, users to
>     show up bearing pitchforks and burning oil).
> 
>     - Kevin
> 
> 
> There is more than TTL expiry involved. TTL on the root is pretty
> long (60 days). There are also the regular and far more frequent checks
> for fastest response. These are performed according to an algorithm in
> BIND that I have not seen documented. It is possible that these queries
> are responsible, especially as queries are going out to multiple root
> servers.

It sometimes happens that a recursive server 'learns' a new RRset for
the root servers as a result of something like an upward delegation by
an old DNS server.  Clearly this information is not as trustworthy as
the list of root nameservers obtained directly from the root nameservers
themselves - so just in case, named goes back and re-primes its list.

That's usually why the root server list is being refreshed more often
than the TTL.
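
To put numbers on "more often than the TTL": the two "NS? ." queries
quoted above are only about nine minutes apart. The following is a rough
check in Python, using the two timestamps from the thread and taking the
60-day TTL figure quoted above at face value (it is an assumption here,
not a measured value; the variable names are illustrative only):

```python
from datetime import datetime, timedelta

# Timestamps of the two "NS? ." queries mentioned earlier in the thread.
first_query = datetime.strptime("15:36:44", "%H:%M:%S")
second_query = datetime.strptime("15:45:52", "%H:%M:%S")

# TTL figure as quoted in the thread; an assumption, not a measured value.
quoted_ttl = timedelta(days=60)

gap = second_query - first_query
gap_seconds = int(gap.total_seconds())
within_ttl = gap < quoted_ttl

print(f"gap between queries: {gap_seconds} s")
print(f"second query inside TTL window: {within_ttl}")
```

Whatever the exact TTL is, a gap of a few hundred seconds is far too
short for plain TTL expiry to explain the second query.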

There's one other reason which is less usual (but might indeed happen on
a server that is predominantly forwarding and not using the roots), and
that is if there is limited cache space, then the root server RRset
could end up being pushed out to make space for newer entries by means
of the LRU (least recently used) algorithm.
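
As a rough illustration of that eviction effect, here is a toy LRU cache
in Python. It is only a sketch and not how named implements its cache:
the LRUCache class, the entry-count limit, and the example names are all
assumptions made for the example (a real cache is bounded by memory, not
by a number of entries).

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache keyed by (name, type); not BIND's code."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used


cache = LRUCache(max_entries=3)
cache.put((".", "NS"), "root server list")  # primed once at startup

# A forwarding-heavy workload never touches the root entry again.
for host in ("a.example.", "b.example.", "c.example."):
    cache.put((host, "A"), "forwarded answer")

root_still_cached = cache.get((".", "NS")) is not None
print(root_still_cached)  # False: the root entry was evicted despite its long TTL
```

The point is only that, in a small cache, an entry nobody reads can be
pushed out long before its TTL runs down, which would force a fresh
priming query the next time the roots are needed.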

But I think my first explanation is more likely in this case.

Cathy