timeouts and negative caching
Gerd v. Egidy
lists at egidy.de
Thu Jun 11 13:27:09 UTC 2015
Hi,
I'm running BIND as a recursive resolver behind a low-bandwidth internet line.
When the line is congested, requests sometimes time out. When the DNS client
retries the query, BIND usually retries the request as well and eventually
succeeds. So far so good.
But now I sometimes see that BIND does not retry immediately, but instead seems
to cache the error for up to 5 minutes (300 seconds). The negative answer is
then returned right away, without checking whether the remote server can be
reached again.
Here is an example:
> time dig @localhost www.strato.de
; <<>> DiG 9.9.3-P2-RedHat-9.9.3-2.P2.i2n <<>> @localhost www.strato.de
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43535
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.strato.de. IN A
;; Query time: 4397 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Jun 11 14:14:17 CEST 2015
;; MSG SIZE rcvd: 42
real 0m0.007s
user 0m0.004s
sys 0m0.000s
When I look into the bind cache I see this:
> rndc dumpdb -all
> cat cache_dump.db
[...]
; authauthority
strato.de. 85530 NS ns3.strato.de.
85530 NS ns4.strato.de.
85530 NS ns1.strato.de.
85530 NS ns2.strato.de.
; additional
ns1.strato.de. 85530 A 193.141.40.1
; additional
ns2.strato.de. 85530 A 81.169.144.234
; additional
ns3.strato.de. 85530 A 195.122.141.2
; additional
85530 AAAA 2a00:e10:2004::2
; additional
ns4.strato.de. 85530 A 192.166.192.4
; additional
85530 AAAA 2a01:238:e100:192::4
[...]
;
; Address database dump
;
[...]
; ns2.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
; ns3.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
; ns4.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
; ns1.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
I've seen this "[v4 TTL 59]" go up to 300.
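To watch how these values develop over time, the entries can be pulled out of
the dump with a few lines of Python. This is only a rough sketch: the regular
expression is guessed from the four lines shown above and is not taken from
any documented dump format, and the helper names parse_adb_line and
failing_v4 are made up for this example.

```python
import re

# Pattern guessed from the dump lines quoted above, e.g.
#   ; ns2.strato.de [v4 TTL 59] [v4 failure] [v6 unexpected]
# It is an assumption, not a documented format.
ADB_LINE = re.compile(
    r"^;\s*(?P<name>\S+)\s+"
    r"\[v4 TTL (?P<ttl>\d+)\]\s+"
    r"\[v4 (?P<v4>\w+)\]\s+"
    r"\[v6 (?P<v6>\w+)\]"
)


def parse_adb_line(line):
    """Return a dict for one address database line, or None if it does not match."""
    match = ADB_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "v4_ttl": int(match.group("ttl")),
        "v4_state": match.group("v4"),
        "v6_state": match.group("v6"),
    }


def failing_v4(dump_text):
    """Return (name, ttl) pairs for every entry marked '[v4 failure]'."""
    result = []
    for line in dump_text.splitlines():
        entry = parse_adb_line(line)
        if entry is not None and entry["v4_state"] == "failure":
            result.append((entry["name"], entry["v4_ttl"]))
    return result
```

Calling failing_v4(open("cache_dump.db").read()) after each "rndc dumpdb -all"
would list the name servers currently marked as failed, together with the
remaining TTL.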
So there must be some kind of "negative caching" that caches timeouts as well,
and not, like the real negative caching, only actual negative answers.
Where do these 300 seconds come from, and how can I configure them? I'd like to
reduce them drastically, to something like 10 seconds, so that BIND tries to
resolve a query again shortly after a timeout.
Thank you.
Kind regards,
Gerd