Feature request for dig
Mark K. Pettit
pettit at yahoo-inc.com
Wed Mar 7 03:37:31 UTC 2012
Hi, fellow BIND users.
The other day I was attempting to diagnose a problem on a recursive resolving name server. I had just enabled DNSSEC validation, and certain lookups (such as "www.isc.org" and "www.dnssec-failed.org") were failing. Even queries for unsigned domains, such as my own personal domain (which also happens to be in .org), were failing.
I was testing it with this command line:
dig +dnssec +bufsize=XXXX www.isc.org. a
where XXXX is our locally configured edns-udp-size value. The lookup took a long time and eventually failed with a timeout error. To see where it was failing, I tried:
dig +dnssec +bufsize=XXXX @127.0.0.1 www.isc.org. A +trace
I got this output before it hung:
; <<>> DiG 9.8.1-P1 <<>> +dnssec +bufsize=XXXX @127.0.0.1 www.isc.org. A +trace
; (1 server found)
;; global options: +cmd
. 506674 IN NS a.root-servers.net.
. 506674 IN NS e.root-servers.net.
. 506674 IN NS l.root-servers.net.
. 506674 IN NS m.root-servers.net.
. 506674 IN NS d.root-servers.net.
. 506674 IN NS g.root-servers.net.
. 506674 IN NS i.root-servers.net.
. 506674 IN NS f.root-servers.net.
. 506674 IN NS h.root-servers.net.
. 506674 IN NS c.root-servers.net.
. 506674 IN NS k.root-servers.net.
. 506674 IN NS b.root-servers.net.
. 506674 IN NS j.root-servers.net.
. 506674 IN RRSIG NS 8 0 518400 20120313000000 20120305230000 51201 . kBn5abbR2172kIhOfAdf38Mi4IpqkclowMxD2BKh2hg3udwGeJfK3YOA I1Pz9lcb/NzFzh+ndVXZERaofryyoeE15ZD0HQxMqLai7HV6nVKQyiPZ vGXA3CsIua9g8dnnN4RNbYrPnM7i6f/hBgKph8/AcFHXAQfRFZIxiJL1 O50=
;; Received 397 bytes from 127.0.0.1#53(127.0.0.1) in 817 ms
I ended up spending an hour or two trying to figure out what was causing it to hang. In the end, this was the query that was hanging:
dig +dnssec +bufsize=XXXX @a0.org.afilias-nst.info. org. DNSKEY
It turns out that the answer to that query is larger than our "bufsize", so the packet came back truncated, and BIND was re-trying over TCP, but our ACLs weren't set up right to allow that.
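A quick way to confirm this kind of failure is to ask dig not to retry over TCP and look for the "tc" flag in the response header, then repeat the query with TCP forced. The following is only a sketch: the function name is_truncated is made up for illustration, and it assumes dig's usual ";; flags: ..." header line. The +ignore, +tcp, +time and +tries options are standard dig options.

```shell
#!/bin/sh
# Hypothetical helper (not part of dig or BIND): succeeds if the dig output
# on stdin has the TC (truncated) bit set in its header flags line, e.g.
#   ;; flags: qr aa tc; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
is_truncated() {
    grep -Eq '^;; flags:[^;]* tc( |;)'
}

# Intended use (XXXX is the local edns-udp-size, as above):
#   dig +dnssec +bufsize=XXXX +ignore @a0.org.afilias-nst.info. org. DNSKEY | is_truncated \
#       && echo "truncated over UDP; a resolver would retry over TCP"
#   dig +dnssec +tcp +time=5 +tries=1 @a0.org.afilias-nst.info. org. DNSKEY
```

If the first command reports truncation and the second one times out, TCP to port 53 is probably being blocked somewhere along the path.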
My request is this:
Please add something to "dig" that replicates the behavior of BIND as closely as possible with regard to the many queries it issues as part of a DNSSEC-validating resolution.
I ran tcpdump on an unloaded server and captured all DNS query traffic immediately after running "rndc flush", and the queries it asked, in order, were:
Remote server           Domain                    Type
======================= ========================= =======
M.ROOT-SERVERS.NET      www.dnssec-failed.org.    A
M.ROOT-SERVERS.NET      .                         NS
d0.org.afilias-nst.org  www.dnssec-failed.org.    A
dns105.comcast.net      www.dnssec-failed.org.    A
f.root-servers.net      .                         DNSKEY
dns104.comcast.net      dnssec-failed.org.        DNSKEY
a2.org.afilias-nst.info dnssec-failed.org.        DS
b0.org.afilias-nst.org  org.                      DNSKEY
k.root-servers.net      org.                      DS
dns101.comcast.net      dnssec-failed.org.        DNSKEY
dns105.comcast.net      dnssec-failed.org.        DNSKEY
dns103.comcast.net      dnssec-failed.org.        DNSKEY
dns102.comcast.net      dnssec-failed.org.        DNSKEY
c0.org.afilias-nst.org  dns104.comcast.org.       AAAA
dns101.comcast.net      dns104.comcast.org.       AAAA
(The edns-udp-size for this server is 4096.) I realize that some of this traffic might be unusual (such as the query for "dig @M.ROOT-SERVERS.NET . NS"), but the rest of it is normal DNSSEC resolution.
It would be *extremely helpful* if dig printed each query as it issued it, so I could have seen that it was retrying a truncated response and hanging on TCP.
The capture above doesn't even show the fallback to TCP that might happen if the edns-udp-size were lower, as it is at my other location.
I understand I'm asking dig to do what BIND normally does, but since they're both packaged together, it seems like a reasonable request.
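In the meantime, something along these lines can be roughly approximated with a shell wrapper around dig. The sketch below is hypothetical and does not reproduce BIND's resolution logic: it only replays a hand-written list of "server name type" lines (such as the table above), announces each query before sending it, and tries TCP when the UDP answer comes back truncated. The names replay_queries, DIG and BUFSIZE are invented for this example, and it assumes dig exits non-zero when it gets no reply.

```shell
#!/bin/sh
# Hypothetical sketch, not a BIND feature: replay "server name type" lines
# from stdin, printing each query before it is sent.
DIG=${DIG:-dig}            # assumption: overridable so the script can be dry-run
BUFSIZE=${BUFSIZE:-4096}   # the edns-udp-size of the server in the capture

replay_queries() {
    while read -r server name type; do
        [ -n "$server" ] || continue
        echo ">>> UDP @$server $name $type (bufsize=$BUFSIZE)"
        # +ignore keeps dig from silently retrying a truncated answer over TCP.
        if ! out=$($DIG +dnssec +bufsize="$BUFSIZE" +ignore +time=3 +tries=1 \
                   "@$server" "$name" "$type" </dev/null); then
            echo "    no usable reply over UDP"
            continue
        fi
        if printf '%s\n' "$out" | grep -Eq '^;; flags:[^;]* tc( |;)'; then
            echo "    truncated; retrying over TCP"
            echo ">>> TCP @$server $name $type"
            if $DIG +dnssec +tcp +time=3 +tries=1 \
                    "@$server" "$name" "$type" </dev/null >/dev/null; then
                echo "    TCP ok"
            else
                echo "    TCP FAILED"
            fi
        else
            echo "    not truncated"
        fi
    done
}

# Example:
#   printf '%s\n' 'a0.org.afilias-nst.info. org. DNSKEY' | BUFSIZE=XXXX replay_queries
```

With a blocked TCP path, the "TCP FAILED" line would appear right after the query that triggered it, which is the information that was missing from the +trace output.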
Does anyone else see a need for a tool like this?