timeouts and negative caching

Thu Jun 11 14:37:11 UTC 2015

I'm not sure if BIND has a separate tunable for the "timeout vs true
negative answer" scenario you seem to describe, but have you tried setting
max-ncache-ttl very low to see if it affects this?
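
For reference, here is a minimal sketch of what I mean, assuming the
option goes in the global options block of named.conf. The value of 10
seconds is only an illustration taken from your question, not a
recommendation, and I have not verified that it changes the address
database entries shown in your dump:

```
options {
    // Assumption: illustrative value only. max-ncache-ttl caps how long
    // negative answers (NXDOMAIN/NODATA) are kept in the cache, in seconds.
    max-ncache-ttl 10;
};
```

After editing, "rndc reconfig" should pick up the change, and "rndc flush"
clears the cache so that older entries do not mask the effect.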

On 6/11/15, 9:27 AM, "Gerd v. Egidy" <lists at egidy.de> wrote:

>Hi,
>
>I run BIND as a recursive resolver behind a thin internet line. When the
>line is congested, requests sometimes time out. When the DNS client
>retries the query, BIND usually retries the request and eventually
>succeeds. So far so good.
>
>But now I sometimes see that BIND does not retry immediately, but somehow
>caches the error for up to 5 minutes (300 seconds). The negative answer
>is then given right away, without checking again whether the remote
>server can be reached now.
>
>Here is an example:
>
>> time dig @localhost www.strato.de
>; <<>> DiG 9.9.3-P2-RedHat-9.9.3-2.P2.i2n <<>> @localhost www.strato.de
>; (1 server found)
>;; global options: +cmd
>;; Got answer:
>;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43535
>;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
>
>;; OPT PSEUDOSECTION:
>; EDNS: version: 0, flags:; udp: 4096
>;; QUESTION SECTION:
>;www.strato.de.                 IN      A
>
>;; Query time: 4397 msec
>;; SERVER: 127.0.0.1#53(127.0.0.1)
>;; WHEN: Thu Jun 11 14:14:17 CEST 2015
>;; MSG SIZE  rcvd: 42
>
>real    0m0.007s
>user    0m0.004s
>sys     0m0.000s
>
>When I look into the bind cache I see this:
>
>> rndc dumpdb -all
>> cat cache_dump.db
>[...]
>; authauthority
>strato.de.              85530   NS      ns3.strato.de.
>                        85530   NS      ns4.strato.de.
>                        85530   NS      ns1.strato.de.
>                        85530   NS      ns2.strato.de.
>; additional
>ns1.strato.de.          85530   A       193.141.40.1
>; additional
>ns2.strato.de.          85530   A       81.169.144.234
>; additional
>ns3.strato.de.          85530   A       195.122.141.2
>; additional
>                        85530   AAAA    2a00:e10:2004::2
>; additional
>ns4.strato.de.          85530   A       192.166.192.4
>; additional
>                        85530   AAAA    2a01:238:e100:192::4
>[...]
>;
>; Address database dump
>;
>[...]
>; ns2.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
>; ns3.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
>; ns4.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
>; ns1.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
>
>I've seen this "[v4 TTL 59]" go up to 300.
>
>So there must be some kind of "negative caching" that also caches
>timeouts, unlike the real negative caching, which caches only actual
>negative results.
>
>Where do these 300 seconds come from and how can I configure them? I'd
>like to drastically reduce them to something like 10 seconds or so to
>make sure BIND retries resolving a query shortly after a timeout.
>
>Thank you.
>
>Kind regards,
>
>Gerd