[bind10-dev] initial ideas on the "difference" design

Wed Oct 12 07:41:28 UTC 2011

In order to support IXFR-out, we need to be able to store and read the
differences between versions of a zone.  We'll also need that
capability to support dynamic update.  (To be precise, dynamic
update itself does not necessarily require "difference" handling if
the updates can be made directly on persistent storage, but we
can't always assume that, and it's pretty likely that we'll want
to use IXFR-out for such zones.)  I've been thinking about how to do
this for BIND 10 for a while.  This is an initial brain dump for
discussion.

Whether this design and these APIs make sense is the biggest question,
but if they do, my gut feeling is that it won't be so difficult to
implement, and it's not impossible to complete this framework and
IXFR-out on top of it by the next release (even if we do nothing
except design discussions for the first two weeks).

What I'm currently thinking makes sense is:
- We represent diffs as a sequence of RRs in the form used by IXFR
  (this is derived from BIND 9's journal implementation)
- For database-backed data sources we'll generally store the diffs
  in a separate table of the same database as that for zones and RRs
  (but higher level APIs allow different ways)
- To store diffs, we extend the existing ZoneUpdater class: add a new
  'journaling' parameter to the factory interface (getUpdater()), and
  store diffs as we make updates if it's true.  We'll also provide an
  additional method to ZoneUpdater so that the app can store a set of
  diffs at once.
- To read diffs, we introduce a separate class (tentatively named
  "ZoneJournalReader") and provide a corresponding factory method.
  We create a reader object with the zone name, beginning serial
  number, and ending serial number.  The reader object internally
  works as an iterator over all the diffs from begin to end, and we
  provide an interface to retrieve the "next diff" from the iterator.

This is mostly derived from BIND 9's design, but adjusted to BIND
10's architecture (e.g., in many cases diffs are managed in the same
database, and the details may vary depending on the specific data
source/database backends).

Some more specific ideas:

As for the diff representation, it's a list of IXFR-style difference
sequences.  Each difference sequence is a sequence of RRs: the
pre-transaction SOA, zero or more other deleted RRs, the
post-transaction SOA, and zero or more other added RRs.  Example:

example.com. SOA serial=1
a.example.com. A  192.0.2.1
example.com. SOA serial=2
b.example.com. A  192.0.2.2
c.example.com. A  192.0.2.3

This means a version change from serial 1 to 2 for zone example.com,
deleting a.example.com/A and adding {b,c}.example.com/A.
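
As a rough illustration of the representation above, here is a
minimal Python sketch that splits a flat RR list into difference
sequences, using the SOA RRs as delimiters.  The RR tuple and the
function name split_diff_sequences are hypothetical, made up for this
sketch; they are not part of any existing BIND 10 API.

```python
from collections import namedtuple

# Hypothetical minimal RR representation, for this sketch only.
RR = namedtuple("RR", ["name", "rrtype", "rdata"])


def split_diff_sequences(rrs):
    """Split a flat IXFR-style RR list into difference sequences.

    SOA RRs act as delimiters: an SOA seen while no sequence is open
    starts the deletion part; the next SOA starts the addition part.
    """
    diffs = []
    current = None
    for rr in rrs:
        if rr.rrtype == "SOA":
            if current is None or current["end_soa"] is not None:
                # Pre-transaction SOA: open a new difference sequence.
                current = {"begin_soa": rr, "deleted": [],
                           "end_soa": None, "added": []}
                diffs.append(current)
            else:
                # Post-transaction SOA: switch to the addition part.
                current["end_soa"] = rr
        elif current is None:
            raise ValueError("a difference sequence must start with an SOA")
        elif current["end_soa"] is None:
            current["deleted"].append(rr)
        else:
            current["added"].append(rr)
    if current is not None and current["end_soa"] is None:
        raise ValueError("missing post-transaction SOA")
    return diffs


example = [
    RR("example.com.", "SOA", "serial=1"),
    RR("a.example.com.", "A", "192.0.2.1"),
    RR("example.com.", "SOA", "serial=2"),
    RR("b.example.com.", "A", "192.0.2.2"),
    RR("c.example.com.", "A", "192.0.2.3"),
]
diffs = split_diff_sequences(example)
```

With the example data, diffs holds a single sequence whose "deleted"
list contains a.example.com/A and whose "added" list contains
b.example.com/A and c.example.com/A.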

Exactly how to store the diffs in a database table can differ between
data sources, but one example would be to use this schema:

table diffs (id integer primary key, zone_id integer,
             version integer, operation integer,
             name string, rrtype string,
             [columns for ttl, rdata: omitted])

Where "operation" can be either 0 (delete) or 1 (add) - this can
also be a string, of course.

Then the stored database rows corresponding to the above example would
look like this (assume "zone_id" for example.com is 10):

ID, ZID, ver, op, name,             rrtype
 1,  10,   1,  0, "example.com.",   "SOA", ...
 2,  10,   1,  0, "a.example.com.", "A", ...
 3,  10,   2,  1, "example.com.",   "SOA", ...
 4,  10,   2,  1, "b.example.com.", "A", ...
 5,  10,   2,  1, "c.example.com.", "A", ...

To retrieve all the diffs between two given versions (B and E)
for a zone whose zone_id is Z, we'll execute:

select * from diffs where
  zone_id = Z and
  id >= (select id from diffs where zone_id = Z and version = B
         and operation = 0 order by id asc limit 1)
  and
  id <= (select id from diffs where zone_id = Z and version = E
         and operation = 1 order by id desc limit 1)
  order by id;
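
To make the retrieval concrete, here is a small sketch using Python's
sqlite3 module with an in-memory database.  It assumes the example
schema above (ttl and rdata columns omitted), and the helper name
get_diffs is hypothetical.  Note that the sketch also restricts the
two subqueries to the zone and sorts the result by id; both are
assumptions of this sketch, made so that another zone using the same
serial numbers cannot shift the id range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diffs (id INTEGER PRIMARY KEY, zone_id INTEGER, "
    "version INTEGER, operation INTEGER, name TEXT, rrtype TEXT)")

# (zone_id, version, operation, name, rrtype); ids are assigned 1, 2, ...
rows = [
    (10, 1, 0, "example.com.", "SOA"),    # serial 1 -> 2 for zone 10
    (10, 1, 0, "a.example.com.", "A"),
    (10, 2, 1, "example.com.", "SOA"),
    (10, 2, 1, "b.example.com.", "A"),
    (10, 2, 1, "c.example.com.", "A"),
    (10, 2, 0, "example.com.", "SOA"),    # serial 2 -> 3 for zone 10
    (10, 2, 0, "b.example.com.", "A"),
    (10, 3, 1, "example.com.", "SOA"),
    (20, 1, 0, "example.org.", "SOA"),    # unrelated zone, same serials
    (20, 2, 1, "example.org.", "SOA"),
]
conn.executemany(
    "INSERT INTO diffs (zone_id, version, operation, name, rrtype) "
    "VALUES (?, ?, ?, ?, ?)", rows)

QUERY = """
SELECT id, version, operation, name, rrtype FROM diffs
WHERE zone_id = :z
  AND id >= (SELECT id FROM diffs
             WHERE zone_id = :z AND version = :b AND operation = 0
             ORDER BY id ASC LIMIT 1)
  AND id <= (SELECT id FROM diffs
             WHERE zone_id = :z AND version = :e AND operation = 1
             ORDER BY id DESC LIMIT 1)
ORDER BY id
"""


def get_diffs(conn, zone_id, begin_serial, end_serial):
    """Return the diff rows from begin_serial to end_serial for a zone."""
    params = {"z": zone_id, "b": begin_serial, "e": end_serial}
    return conn.execute(QUERY, params).fetchall()


one_step = get_diffs(conn, 10, 1, 2)    # rows for serial 1 -> 2
two_steps = get_diffs(conn, 10, 1, 3)   # rows for serial 1 -> 2 -> 3
```

If a requested serial is not in the table, the corresponding subquery
yields NULL and the query simply returns no rows; a real
implementation would probably want to report that case explicitly.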

Alternatively, we might rely more heavily on the database's own
features, such as introducing an update trigger so that any
insert/delete operation on the "records" table will be logged in a
separate table.  In this initial idea I didn't choose that path
because it seemed more difficult to retrieve diffs in the way we
want (i.e., getting diffs for two given versions in a sorted order),
and because if we allow the database to be "non captive", some
unrelated diffs may also be logged and confuse the retrieval
interface.  But the higher level API should be flexible enough to
allow the lowest-level representation to use this approach.
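
For comparison, a trigger-based log could look roughly like the
following sqlite3 sketch.  The "records" columns, the "records_log"
table and the trigger names are all hypothetical, chosen only for
illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (id INTEGER PRIMARY KEY, zone_id INTEGER,
                      name TEXT, rrtype TEXT, rdata TEXT);
CREATE TABLE records_log (id INTEGER PRIMARY KEY, operation INTEGER,
                          zone_id INTEGER, name TEXT, rrtype TEXT,
                          rdata TEXT);

-- Log every insert on "records" as an "add" (operation = 1).
CREATE TRIGGER log_add AFTER INSERT ON records
BEGIN
    INSERT INTO records_log (operation, zone_id, name, rrtype, rdata)
    VALUES (1, NEW.zone_id, NEW.name, NEW.rrtype, NEW.rdata);
END;

-- Log every delete on "records" as a "delete" (operation = 0).
CREATE TRIGGER log_delete AFTER DELETE ON records
BEGIN
    INSERT INTO records_log (operation, zone_id, name, rrtype, rdata)
    VALUES (0, OLD.zone_id, OLD.name, OLD.rrtype, OLD.rdata);
END;
""")

conn.execute("INSERT INTO records (zone_id, name, rrtype, rdata) "
             "VALUES (10, 'a.example.com.', 'A', '192.0.2.1')")
conn.execute("DELETE FROM records WHERE name = 'a.example.com.'")

log = conn.execute(
    "SELECT operation, name FROM records_log ORDER BY id").fetchall()
```

The log rows here carry no zone serial, and any client that touches
"records" gets logged, which is the retrieval difficulty described
above.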

In my original example of using a specialized table, the RRs might be
stored as binary data for performance reasons.  Right now I'm not sure
if we need such an optimization, though.

To show the main points of the intended APIs, here's a hypothetical
(and incomplete) Python code fragment:

class ZoneUpdater:
    """We already have this class.  But its constructor now has a third
    parameter, 'journaling'.  If journaling is True, every change made in
    the updater will be recorded in persistent storage for the
    corresponding data source.  Unless commit() is performed on the updater,
    changes to the journal are not committed either.  On performing
    commit, the journal data are committed first; the main commit
    operation may or may not succeed.

    """
    def __init__(self, replace, journaling):
        """To be fully flexible, we separate the interfaces for the
        transaction of the main updates (begin_transaction) and the
        transaction for adding diffs (begin_diff_transaction).  But if
        the actual backend uses tables of the same database, the latter
        could actually be a no-op.

        """
        self.__accessor = ... # this is an accessor to the actual data source
        self.__accessor.begin_transaction(replace)
        self.__journaling = journaling
        if self.__journaling:
            self.__accessor.begin_diff_transaction()

    def add_rrset(self, rrset):
        """Note: the actual behavior of 'add_diff' may vary in different
        data source backends.  In an extreme case, the backend is very
        smart and journaling may be integrated in 'add_rr'.  In that case
        'add_diff' would be a no-op.

        """
        for rr in rrset:
            self.__accessor.add_rr(rr)
            if self.__journaling:
                self.__accessor.add_diff(('add', rr))

    # delete_rrset will be extended in a similar way.

    def commit(self):
        if self.__journaling:
            self.__accessor.commit_diff()
        self.__accessor.commit()

class ZoneJournalReader:
    def __init__(self, zone, begin_serial, end_serial):
        self.__accessor = ... # this is an accessor to the actual data source
        self.__zone = zone
        self.__begin_serial = begin_serial
        self.__end_serial = end_serial
        # _get_diffs will internally execute the "select" to retrieve the
        # sequence:
        self.__accessor._get_diffs(begin_serial, end_serial)

    def get_next(self):
        return self.__accessor.get_next_diff(self.__begin_serial,
                                             self.__end_serial)
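
To illustrate the intended commit semantics, here is a minimal
in-memory stand-in for the accessor.  InMemoryAccessor is purely
hypothetical: it implements only the accessor methods called by
ZoneUpdater in the fragment above, and keeps everything in Python
lists instead of a database.  The point is that diffs are staged and
only become visible in the journal once commit_diff() is called.

```python
class InMemoryAccessor:
    """Hypothetical stand-in for a data source accessor (sketch only)."""

    def __init__(self):
        self.records = []           # committed zone data
        self.journal = []           # committed diffs
        self._pending_rrs = None    # staged zone data
        self._pending_diffs = None  # staged diffs

    def begin_transaction(self, replace):
        # With replace=True we start from an empty zone.
        self._pending_rrs = [] if replace else list(self.records)

    def begin_diff_transaction(self):
        self._pending_diffs = []

    def add_rr(self, rr):
        self._pending_rrs.append(rr)

    def add_diff(self, diff):
        self._pending_diffs.append(diff)

    def commit_diff(self):
        self.journal.extend(self._pending_diffs)
        self._pending_diffs = None

    def commit(self):
        self.records = self._pending_rrs
        self._pending_rrs = None


# Drive the accessor with the same call sequence that ZoneUpdater uses
# when journaling is True.
accessor = InMemoryAccessor()
accessor.begin_transaction(replace=False)
accessor.begin_diff_transaction()
new_rrs = [("b.example.com.", "A", "192.0.2.2"),
           ("c.example.com.", "A", "192.0.2.3")]
for rr in new_rrs:
    accessor.add_rr(rr)
    accessor.add_diff(("add", rr))

journal_before_commit = list(accessor.journal)  # nothing committed yet
accessor.commit_diff()   # journal first ...
accessor.commit()        # ... then the main update
```

A backend that keeps diffs in the same database as the zone data
could instead make begin_diff_transaction() and commit_diff() no-ops
and rely on the main transaction alone.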

---
JINMEI, Tatuya