NOTIFY(SOA) for zone already xferring

Scott Bertilson scott at nts.umn.edu
Wed Feb 28 03:04:28 UTC 2001


  We have been having periodic, spotty problems with
zones on slave servers not synchronizing from our stealth
master.  This first happened just after installing 8.2.3
on one of our public slaves.  It re-occurred just a few
days ago on a different public slave.  When it happens,
it happens to only some of the zones carried by the slave
even though many other zones are receiving NOTIFY messages
and synchronizing.  As a result of grubbing through the
logs on the master and the slave, here's what happens
(based on the landmarks found in the log):

1) Zone updated via dynamic DNS update to
Stealth Master.
2) SM issues a NOTIFY to slave name servers
listed for zone (FWIW SM doesn't have an NS
record, only slaves).
3) Slave responds by requesting the serial
number of the zone for which it just received
a NOTIFY.
**** This is where the slave appears to lose it.
     It logs:
       NOTIFY(SOA) for zone already xferring (xyzzy.dom.ain)
     for every NOTIFY it receives for the zones
     that aren't being updated.  It never
4) When the serial number comes in from the
SM, slave compares with the serial number
it has in memory.  If the zone needs to be
updated, slave forks/execs "named-xfer" to
retrieve the zone.
5) "named-xfer" queries for the serial number
again to verify that the zone needs to be
transferred (it has been passed the local
serial number as a parameter).
6) If the zone is out of date, "named-xfer"
initiates a zone transfer to pull down the
new zone.
7) The exit status from "named-xfer" tells
slave whether there is a new zone file
waiting to be loaded.
8) Slave loads it and the serial number
is updated.
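
The serial comparison in steps 4 and 5 can be sketched as follows.
This is only an illustration, assuming the slave uses RFC 1982 serial
number arithmetic on 32-bit SOA serials; whether 8.2.3 does exactly
this internally is an assumption, and the names serial_gt and
needs_transfer are hypothetical.

```python
# Sketch of an SOA serial comparison using RFC 1982 serial arithmetic.
# Assumption: serials are 32-bit unsigned values that may wrap around.

SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS          # 2**32
HALF = 1 << (SERIAL_BITS - 1)   # 2**31


def serial_gt(a, b):
    """Return True if serial a is newer than serial b (RFC 1982)."""
    a %= MOD
    b %= MOD
    # a is newer when the forward distance from b to a is non-zero and
    # less than half the serial space. A distance of exactly 2**31 is
    # undefined in RFC 1982 and is treated here as "not newer".
    return a != b and ((a - b) % MOD) < HALF


def needs_transfer(master_serial, local_serial):
    """Hypothetical helper: does the slave copy look out of date?"""
    return serial_gt(master_serial, local_serial)
```

With this rule a zone is transferred only when the master's serial is
ahead of the local one, including the case where the serial has
wrapped past 2^32.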

Info from logs (SM has query logging enabled,
slaves do not):
2) When the NOTIFY is sent, it is seen by the uncooperative
slave, as evidenced by the fact that the SM logs:
  XX /slave_ip/dom.ain/SOA/IN
and:
  Received NOTIFY answer (AA) from slave_ip for "dom.ain IN SOA"
4) "named-xfer" is never started, as evidenced by
wrapping it in a script that writes a log
entry.  The
  NOTIFY(SOA) for zone already xferring (xyzzy.dom.ain)
message doesn't show up for this notify, but
does for the next one to come in.  FWIW z_flags
when this happens is 8043.
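
A wrapper of the kind mentioned in item 4 might look roughly like the
sketch below. It is not the script actually used here. The path in
REAL_XFER, the syslog identity, and the function names format_entry
and run_wrapper are all hypothetical; the idea is only to log each
invocation and then hand control to the real binary so that its exit
status reaches the name server unchanged.

```python
#!/usr/bin/env python3
# Sketch of a logging wrapper around named-xfer.
# Assumption: the real binary has been moved to REAL_XFER (hypothetical
# path) and this script is installed where named expects named-xfer.

import os
import sys
import syslog

REAL_XFER = "/usr/sbin/named-xfer.real"  # hypothetical location


def format_entry(argv):
    """Build the log text for one invocation (program name omitted)."""
    return "named-xfer wrapper: started with args: " + " ".join(argv[1:])


def run_wrapper(argv):
    """Log the invocation, then replace this process with named-xfer."""
    syslog.openlog("named-xfer-wrap", syslog.LOG_PID, syslog.LOG_DAEMON)
    syslog.syslog(syslog.LOG_INFO, format_entry(argv))
    # execv never returns on success, so the exit status seen by the
    # parent is the one produced by the real named-xfer.
    os.execv(REAL_XFER, [REAL_XFER] + argv[1:])
```

An installed copy would end with a call to run_wrapper(sys.argv). If
no log entry appears after a NOTIFY, the transfer program was never
started for that zone.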

The zone had been successfully staying in sync:
a NOTIFY would come in, an AXFR would be started,
it would succeed, and the zone would load.

The zone does not carry an NS for the SM, only
for the slaves.

We can't really run the slaves with debug enabled
because it can take many days before the problem
appears.  Other hints or suggestions would be
most welcome.
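
Since debug cannot stay enabled for days, one cheap alternative is to
scan the ordinary slave logs for the message quoted above and count
it per zone. The sketch below assumes only the message text shown in
this post; the function name stuck_zone_counts is hypothetical, and
anything preceding the message on a log line is ignored.

```python
# Sketch: count "already xferring" complaints per zone in slave logs.
# Assumption: each log line contains the message exactly as quoted in
# this post, with the zone name in parentheses at the end.

import re
from collections import Counter

PATTERN = re.compile(
    r"NOTIFY\(SOA\) for zone already xferring \(([^)]+)\)"
)


def stuck_zone_counts(lines):
    """Return a Counter mapping zone name to number of matching lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            # DNS names are case-insensitive, so normalise before counting.
            counts[match.group(1).lower()] += 1
    return counts
```

A zone whose count keeps growing across successive NOTIFYs is a
candidate for being stuck in the state described above.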
				Thanks, Scott

