[bind10-dev] transfer problems

Jeremy C. Reed jreed at isc.org
Wed Apr 27 13:36:34 UTC 2011


The following are some of the issues related to running b10-auth in 
production.

I think the use is: a master contains the signed zone (it is not 
"hidden" but it also is not part of the delegation). Four NS records. 
One of these is the BIND 10 server. I think the three others may be 
configured to pull from the new BIND 10 server, but I do not know how 
they are configured.

I didn't install this BIND 10. I am pretty sure it is from the March 
2011 snapshot tarball. The verbose output says bind10 20110223 for BIND 
10 20110322.

It is run with --verbose, and stderr and stdout go to two different 
files. If there is a failure, it is not easy to associate entries 
between the two files, since most output doesn't have any timestamp or 
identifier.

Xfrin appears to fail every time:

[b10-xfrin] transfer of 'bind10.isc.org.': AXFR started
[b10-xfrin] Error while loading bind10.isc.org.: receive data from 
socket time out.
[b10-xfrin] transfer of 'bind10.isc.org.': AXFR failed

(This is logged over 80 times.)

I don't know when the failures are logged, as I don't have the very 
last one. It would be nice to have real logging with timestamps and the 
PID of the process.
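
As a rough sketch of what I mean by real logging, here is how it could 
look with Python's standard logging module. The helper name make_logger 
and the exact line format are only my assumptions for illustration; 
this is not what b10-xfrin does today.

```python
import io
import logging
import time

def make_logger(name, stream):
    """Return a logger whose lines carry a UTC timestamp, its name and the PID."""
    handler = logging.StreamHandler(stream)
    formatter = logging.Formatter(
        "%(asctime)s [%(name)s] pid=%(process)d %(levelname)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    formatter.converter = time.gmtime  # log in UTC rather than local time
    handler.setFormatter(formatter)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger

# Write to an in-memory buffer here; a real daemon would use a file or syslog.
buf = io.StringIO()
log = make_logger("b10-xfrin", buf)
log.error("transfer of '%s': AXFR failed", "bind10.isc.org.")
print(buf.getvalue(), end="")
```

With something like this, entries from different processes or files 
could be lined up by timestamp and PID.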

The output doesn't indicate when the timeout happened or which master 
it came from. A simple dig for AXFR against the master works fine from 
the same system.

The serial number on master and the three other 
secondaries is 2011042400. The serial on the BIND 10 server is 
2011041300.
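
To spell out the comparison, here is a minimal sketch of the serial 
check using RFC 1982 serial number arithmetic. The function name 
serial_gt is mine, and I am not claiming this is how b10-xfrin compares 
serials.

```python
SERIAL_BITS = 32
HALF = 1 << (SERIAL_BITS - 1)  # 2**31, the wrap-around window in RFC 1982

def serial_gt(s1, s2):
    """True if s1 is 'greater than' s2 under RFC 1982 serial arithmetic."""
    return (s1 < s2 and s2 - s1 > HALF) or (s1 > s2 and s1 - s2 < HALF)

master_serial = 2011042400  # master and the three other secondaries
bind10_serial = 2011041300  # what the BIND 10 server is serving

# The master is ahead, so a transfer should have happened.
print(serial_gt(master_serial, bind10_serial))
```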

This is the second time I have seen xfrin not working. Early last week 
it was serving wrong data. It has been restarted since then.

Jelte provided a minor patch for xfrin.py.in to also output 
self._master_address, so we know where the timeout came from. I 
haven't used it yet.

The verbose output also includes:

--------------------------------------
 [b10-auth] received a message:
;; ->>HEADER<<- opcode: NOTIFY, status: NOERROR, id: 13575
;; flags: aa ; QUESTION: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;bind10.isc.org. IN SOA

;; ANSWER SECTION:
bind10.isc.org. 0 IN SOA ns-int.isc.org. hostmaster.isc.org. 2011042400 
7200 3600 604800 3600

[b10-auth] received a message:
--------------------------------------

The above makes no sense. Notice the "flags: aa ;"

The answer serial is not the same as sent other times (2011041300). This 
specific query doesn't have any corresponding answer sent back. (I don't 
see any corresponding "sending a response" for same.)

Why would it receive and log a received message that includes the answer 
section?

Maybe this corresponds to it originating a SOA check, but that original 
query is not noted in the verbose output.
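
For anyone who wants to check the header fields by hand, here is a 
small sketch that decodes a 12-byte DNS header. The header bytes are 
built by hand to match the values in the dump above (id 13575, opcode 
4 which is NOTIFY, aa set, one question, one answer); they are not from 
a capture, and decode_header is just an illustrative helper.

```python
import struct

def decode_header(wire):
    """Decode the fixed 12-byte DNS header into a dict of fields."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", wire[:12])
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),      # False = request, True = response
        "opcode": (flags >> 11) & 0xF,   # 4 = NOTIFY
        "aa": bool(flags & 0x0400),
        "rcode": flags & 0xF,
        "counts": (qd, an, ns, ar),      # QUESTION, ANSWER, AUTHORITY, ADDITIONAL
    }

# Hand-built header mirroring the logged message (assumed QR=0, a request).
header = struct.pack("!6H", 13575, (4 << 11) | 0x0400, 1, 1, 0, 0)
info = decode_header(header)
print(info)
```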

Note that this is the correct serial that all the other auth servers 
(except this one) know.

I don't know if we send any response if our xfrin fails. (I didn't do a 
capture yet.)

Again, the goal of this BIND 10 server is to be the master used by the 
other three public auth servers. But the Xfrout.log doesn't indicate 
that at all: no notifies received or transfers out have been logged 
since it was restarted on April 21.

This is important because it is serving signatures expiring at
20110513201947 (the correct one expires at 20110523233227).
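
To put a number on that gap, here is a quick sketch that parses the 
two expiration values (YYYYMMDDHHMMSS, treated as UTC) and prints the 
difference. The helper name parse_rrsig_time is only for illustration.

```python
from datetime import datetime, timezone

def parse_rrsig_time(text):
    """Parse a YYYYMMDDHHMMSS signature timestamp as a UTC datetime."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

served = parse_rrsig_time("20110513201947")   # what the BIND 10 server serves
correct = parse_rrsig_time("20110523233227")  # what the master has

# How far the served signatures lag behind the current ones.
lag = correct - served
print(lag)
```

The stale signatures expire roughly ten days before the current ones.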


  Jeremy C. Reed
  ISC


