[bind10-dev] transfer problems
Jeremy C. Reed
jreed at isc.org
Wed Apr 27 13:36:34 UTC 2011
The following are some of the issues related to running b10-auth in
production.
I think the use is: a master contains the signed zone (it is not
"hidden" but it also is not part of the delegation). Four NS records.
One of these is the BIND 10 server. I think the three others may be
configured to pull from the new BIND 10 server, but I do not know how
they are configured.
I didn't install BIND 10 myself. I am pretty sure it is from the March
2011 snapshot tarball. The verbose output says bind10 20110223 for BIND 10
20110322.
It is run with --verbose, and stderr and stdout go to two
different files. If there is a failure, it is not easy to associate
entries between the two files, since most output doesn't have any
timestamp or identifier.
Xfrin appears to fail every time:
[b10-xfrin] transfer of 'bind10.isc.org.': AXFR started
[b10-xfrin] Error while loading bind10.isc.org.: receive data from
socket time out.
[b10-xfrin] transfer of 'bind10.isc.org.': AXFR failed
(This is logged over 80 times.)
I don't know when the failures were logged, as I don't have the very last
one. It would be nice to have real logging with timestamps and the PID of
the process.
The error doesn't indicate when the timeout happened or which server it
was waiting on. A simple dig against the master for AXFR works fine from
the same system.
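As a rough illustration of the kind of log line asked for above (timestamp, process name and PID on every entry), here is a minimal sketch using Python's standard logging module. It is not how b10-xfrin logs today; the make_logger helper, the format string and the 192.0.2.1 master address are all made up for the example.

```python
import logging
import sys
import time


def make_logger(name, stream=sys.stderr):
    """Return a logger whose lines carry a UTC timestamp, the name and the PID.

    Hypothetical helper for illustration; not part of BIND 10.
    """
    handler = logging.StreamHandler(stream)
    formatter = logging.Formatter(
        "%(asctime)s [%(name)s/%(process)d] %(levelname)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    formatter.converter = time.gmtime  # log in UTC rather than local time
    handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = make_logger("b10-xfrin")
    # 192.0.2.1 is a documentation address standing in for the real master.
    log.error("transfer of '%s': AXFR failed (master %s)",
              "bind10.isc.org.", "192.0.2.1")
```

With every line stamped this way, entries from the separate stdout and stderr files could be lined up by time and PID.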
The serial number on master and the three other
secondaries is 2011042400. The serial on the BIND 10 server is
2011041300.
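For reference, the comparison a secondary is expected to make between these two serials can be sketched with RFC 1982 serial number arithmetic. This is only an illustration of the rule, not the code b10-xfrin actually runs; serial_gt is a name invented for this example.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # serials wrap at 2**32
HALF = 1 << (SERIAL_BITS - 1)   # 2**31, the "half circle" in RFC 1982


def serial_gt(a, b):
    """Return True if serial a is newer than serial b (RFC 1982).

    When the two serials are exactly 2**31 apart the comparison is
    undefined by the RFC; this sketch returns False in that case.
    """
    a %= MOD
    b %= MOD
    return a != b and ((a - b) % MOD) < HALF


master_serial = 2011042400   # master and the three other secondaries
local_serial = 2011041300    # what the BIND 10 server is still serving

# The master is ahead, so a refresh or NOTIFY should lead to a transfer.
print(serial_gt(master_serial, local_serial))
```

By this rule the master's serial is newer than the one on the BIND 10 server, so a working xfrin should have pulled the zone.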
This is the second time I have seen xfrin not working. Early last
week it was serving wrong data. It has been restarted since then.
jelte provided a minor patch for xfrin.py.in to also output
self._master_address, to show where the timeout came from. I haven't used
it yet.
The verbose output also includes:
--------------------------------------
[b10-auth] received a message:
;; ->>HEADER<<- opcode: NOTIFY, status: NOERROR, id: 13575
;; flags: aa ; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;bind10.isc.org. IN SOA
;; ANSWER SECTION:
bind10.isc.org. 0 IN SOA ns-int.isc.org. hostmaster.isc.org. 2011042400
7200 3600 604800 3600
[b10-auth] received a message:
--------------------------------------
The above makes no sense. Notice the "flags: aa ;".
The serial in the answer section (2011042400) is not the one this server
sends at other times (2011041300). This specific message doesn't have any
corresponding answer sent back. (I don't see any corresponding "sending a
response" for it.)
Why would it receive and log a received message that includes the answer
section?
Maybe this corresponds to it originating a SOA check, but that original
query is not noted in the verbose output.
Note that this is the correct serial, which all the other auth servers
(except this one) know.
I don't know if we send any response if our xfrin fails. (I didn't do a
capture yet.)
Again, the goal of this BIND 10 server is to be the master used by the
other three public auth servers. But the Xfrout.log doesn't indicate
that at all. No notifies received or transfers out have been logged since
it was restarted on April 21.
This is important because the BIND 10 server is serving signatures that
expire at 20110513201947 (the correct ones expire at 20110523233227).
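To put numbers on that, the two expiration values can be read as RRSIG timestamps in YYYYMMDDHHmmSS form (UTC) and compared. This is a small sketch only; parse_rrsig_time is a name invented for the example, and the reference time is simply the date of this message.

```python
from datetime import datetime, timezone


def parse_rrsig_time(text):
    """Parse an RRSIG timestamp in YYYYMMDDHHmmSS presentation form (UTC)."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


served = parse_rrsig_time("20110513201947")    # what the BIND 10 server serves
correct = parse_rrsig_time("20110523233227")   # what the master has

# Reference point: the date of this message, not "now".
report_time = datetime(2011, 4, 27, 13, 36, 34, tzinfo=timezone.utc)

print("stale signatures expire", correct - served, "earlier than the current ones")
print("time left before the served signatures expire:", served - report_time)
```

If the transfers keep failing, the stale zone will start failing DNSSEC validation once the served signatures expire.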
Jeremy C. Reed
ISC