[bind10-dev] Is AXFR/IXFR needed for DB-based (especially large) zones?

JINMEI Tatuya / 神明達哉 jinmei at isc.org
Sat Jun 11 00:32:10 UTC 2011


I have questions:

What would people expect for synchronizing DB-based servers?
Especially for very large zones such as some of the higher level
domains?  (Would those operators be interested in using DB-based zones
for that purpose in the first place?)

The most straightforward answer would be to implement and use AXFR/IXFR
on top of these zones, but as I said in a separate thread, AXFR(-out)
doesn't seem to be realistic for a huge zone:

 - to ensure consistency, the primary server (the "server" side of
   AXFR) needs to lock the database (or at least the relevant table)
   until it gets all records of a particular serial to be transferred.
 - if the primary server does this while sending AXFR responses, it
   may take a very long time (like tens of minutes), depending on the
   size of the zone, the network bandwidth, and the responsiveness of
   the secondary server.  During that period no updates can be made
   to the database.
 - the primary server could first retrieve all records into memory, at
   which point the lock can be released, and then send AXFR responses
   from the in-memory copy.  This may still take a long time (like
   several minutes), depending on (again) the size of the zone and
   the communication overhead between the DNS and DB servers.  For a
   huge zone, there's another concern of whether available memory is
   sufficient.
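
To make the third option concrete, here is a minimal sketch of the
"read everything under one lock, then serve from memory" approach.
It uses SQLite through Python's sqlite3 module purely for
illustration; the table name `records`, its columns, and the function
names are hypothetical and are not BIND 10's data source API.  The
connection is assumed to be opened with isolation_level=None so that
the explicit BEGIN/COMMIT are the only transaction boundaries.

```python
import sqlite3


def snapshot_zone(conn, zone_id):
    """Copy all records of one zone into memory under one transaction.

    The read lock is held only while fetchall() runs, not while the
    (possibly slow) secondary is being served.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")  # explicit transaction; needs isolation_level=None
    try:
        cur.execute(
            "SELECT name, ttl, rdtype, rdata FROM records "
            "WHERE zone_id = ? ORDER BY id",
            (zone_id,),
        )
        rows = cur.fetchall()  # memory use grows with the zone size
    finally:
        cur.execute("COMMIT")  # lock released; updates may resume
    return rows


def axfr_batches(rows, per_message=100):
    """Yield groups of records, one group per AXFR response message."""
    for i in range(0, len(rows), per_message):
        yield rows[i:i + per_message]
```

The trade-off is visible in the sketch: the lock duration is bounded
by the time to fetch the rows rather than by the secondary's speed,
but the whole zone has to fit in `rows`.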

Looking at other implementations, BIND (9) DLZ insists that it's a bad
practice (http://bind-dlz.sourceforge.net/worst_practices.html) and
that database replication should be used instead.

DLZ supports AXFR, too, but if I understand the code correctly, it
simply takes the in-memory approach, so it's not scalable for larger
zones.  PowerDNS's AXFR support basically seems to be the same (I
don't know what the recommended practice for server synchronization
is when using PowerDNS).

I'm not even sure whether we should try to solve this problem in the
first place.  If operators of very large zones are not interested in
running a DB-based backend or in supporting AXFR, we might simply say
"the AXFR support isn't designed for large zones; please don't use it
if your zone is large".  I'd be very much interested in hearing what
operators on this list think.

(In theory IXFR-out could have the same problem because the size of
the diff could be very big, even for a single bump of the serial.  But
in practice I guess there are more ways to work around this with
IXFR, so I mainly focused on AXFR above.)

To answer Jelte's related question:

At Tue, 07 Jun 2011 11:05:48 +0200,
Jelte Jansen <jelte at isc.org> wrote:

> I think we should let this depend on the specific backend then;
> loading the entire zone into memory for an axfr (or big ixfr) isn't
> feasible imo, so if at all possible we should have a smart iterator
> (that reads in separate chunks at a time). If this is not possible
> with any specific backend, or if locking would cause problems, we
> can read it at once and then serve it. And document that you should
> not use that specific backend for large zones.

I'm not a DB expert, so it's quite possible I'm missing some options,
but I can't think of an existing mechanism, or a way to implement one,
that would provide such a smart iterator for a DB.  We might be able
to do that by designing the record table in a tricky way (e.g.
including version information in the records, marking "deleted"
records without actually deleting them while someone is reading that
version of records, etc.), but I'm not sure if it's feasible or worth
the complexity.
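
For what it's worth, the "tricky" table design could look roughly
like the sketch below.  Everything in it is an assumption made for
illustration: the columns `added_serial` and `deleted_serial`, the
function names, and the use of SQLite via Python's sqlite3 module.
It also ignores serial number wraparound, the SOA record itself, and
garbage collection of rows that no reader needs any more, which is
where much of the real complexity would be.

```python
import sqlite3

SCHEMA = """
CREATE TABLE records (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    ttl INTEGER NOT NULL,
    rdtype TEXT NOT NULL,
    rdata TEXT NOT NULL,
    added_serial INTEGER NOT NULL,  -- first serial containing the record
    deleted_serial INTEGER          -- first serial without it, or NULL
)
"""


def add_record(conn, name, ttl, rdtype, rdata, new_serial):
    conn.execute(
        "INSERT INTO records (name, ttl, rdtype, rdata, added_serial) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, ttl, rdtype, rdata, new_serial),
    )
    conn.commit()


def delete_record(conn, name, rdtype, rdata, new_serial):
    # Mark the row as deleted instead of removing it, so that readers
    # of older serials still see it.
    conn.execute(
        "UPDATE records SET deleted_serial = ? "
        "WHERE name = ? AND rdtype = ? AND rdata = ? "
        "AND deleted_serial IS NULL",
        (new_serial, name, rdtype, rdata),
    )
    conn.commit()


def iter_zone_version(conn, serial, chunk_size=1000):
    """Yield the records of one serial, chunk by chunk.

    Each chunk is a separate short query, so no lock is held between
    chunks and only chunk_size rows are in memory at a time.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, name, ttl, rdtype, rdata FROM records "
            "WHERE id > ? AND added_serial <= ? "
            "AND (deleted_serial IS NULL OR deleted_serial > ?) "
            "ORDER BY id LIMIT ?",
            (last_id, serial, serial, chunk_size),
        ).fetchall()
        if not rows:
            return
        for row in rows:
            yield row[1:]  # drop the internal id
        last_id = rows[-1][0]
```

With this layout an update made in the middle of a transfer doesn't
affect it: new rows carry a larger `added_serial` and are filtered
out, and "deleted" rows stay visible to the serial being transferred.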

As you indicated, another option is to declare that as "out of scope",
especially if operators are not interested in using it that way.

---
JINMEI, Tatuya


