[bind10-dev] initial ideas on the "difference" design
Kevin Tes
xiejiagui at cnnic.cn
Thu Oct 13 02:20:37 UTC 2011
To support IXFR-out, we should consider IXFR-in too; I think the two
have the same needs.
Whether a 'master' pushes data to its 'slaves' or a 'slave' asks to
synchronize, BIND needs the diff data in both cases.
Example:
Initial state of the zone:
test soa 1
example.test ns ns1.example.test
b.test a 1.1.1.1
ns1.example.test a 1.1.1.2
=======================================================================
Updates to the zone, applied one after another:
nsupdate : remove example.test ns ns1.example.test
remove ns1.example.test a 1.1.1.2
add example.test ns ns2.example.test
add ns2.example.test A 1.1.1.3
// soa + 1
----------------------------------------------------------------------
then add example.test ns ns3.example.test
add ns2.example.test A 1.1.1.5
// soa + 1
-----------------------------------------------------------------------
then remove b.test a 1.1.1.1
add b.test a 1.1.1.4
remove b.test a 1.1.1.4
add b.test a 1.1.1.1
// soa + 1
-----------------------------------------------------------------------
now the zone is
test soa 4
example.test ns ns2.example.test
ns ns3.example.test
b.test a 1.1.1.1
ns2.example.test a 1.1.1.3
ns2.example.test a 1.1.1.5
We just need to store, in order, the changes made each time a user
updates the zone. Then, when synchronizing, a slave only needs to
traverse the recorded changes from soa 1 to soa 4.
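
A rough sketch of that idea in Python, using the example above. The
journal layout and the names JOURNAL, changes_between and apply_ops are
only my assumptions for illustration, not an existing BIND 10 interface.
Each journal entry keeps the individual operations in the order they
were made, which matters for the third update where the same RR is
removed and added back.

```python
# Hypothetical in-memory journal: one entry per SOA bump, stored in the
# order the updates were made: (old_serial, new_serial, ordered operations).
JOURNAL = [
    (1, 2, [("remove", ("example.test", "NS", "ns1.example.test")),
            ("remove", ("ns1.example.test", "A", "1.1.1.2")),
            ("add", ("example.test", "NS", "ns2.example.test")),
            ("add", ("ns2.example.test", "A", "1.1.1.3"))]),
    (2, 3, [("add", ("example.test", "NS", "ns3.example.test")),
            ("add", ("ns2.example.test", "A", "1.1.1.5"))]),
    (3, 4, [("remove", ("b.test", "A", "1.1.1.1")),
            ("add", ("b.test", "A", "1.1.1.4")),
            ("remove", ("b.test", "A", "1.1.1.4")),
            ("add", ("b.test", "A", "1.1.1.1"))]),
]

ZONE_AT_1 = {
    ("example.test", "NS", "ns1.example.test"),
    ("b.test", "A", "1.1.1.1"),
    ("ns1.example.test", "A", "1.1.1.2"),
}


def changes_between(journal, begin, end):
    """Return the ordered operations that take a zone from `begin` to `end`."""
    ops_out = []
    serial = begin
    for old, new, ops in journal:
        if serial == end:
            break
        if old == serial:          # next link in the chain of versions
            ops_out.extend(ops)
            serial = new
    if serial != end:
        raise LookupError("journal does not cover %s -> %s" % (begin, end))
    return ops_out


def apply_ops(zone, ops):
    """Apply the operations in order to a copy of the zone's RR set."""
    zone = set(zone)
    for action, rr in ops:
        if action == "add":
            zone.add(rr)
        else:
            zone.discard(rr)
    return zone


zone_at_4 = apply_ops(ZONE_AT_1, changes_between(JOURNAL, 1, 4))
```

A slave that is already at soa 3 would ask for changes_between(JOURNAL,
3, 4) and get only the last four operations.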
=======================================================================
Should we also consider how the slaves synchronize with the masters
when a zone is deleted?
=======================================================================
On Wed, 2011-10-12 at 00:41 -0700, JINMEI Tatuya / 神明達哉 wrote:
> In order to support IXFR-out, we need to be able to store and read the
> difference between different versions of a zone. We'll also need that
> capability for supporting dynamic update (to be very precise dynamic
> update itself does not necessarily require "difference" handling if
> the updates can be directly made on a persistent storage, but we may
> not always be able to assume that, and it's pretty likely that we want
> to use IXFR-out for such zones). I've been thinking about how to do
> this for BIND 10 for a while. This is an initial brain dump for
> discussion.
>
> Whether this design and APIs make sense is the biggest question, but
> if they do, my gut feeling is that it won't be so difficult to
> implement and it's not impossible to complete this framework and
> IXFR-out on top of it by the next release (even if we do nothing except
> for design discussions for the first two weeks).
>
> What I'm currently thinking makes sense is:
> - We represent diffs in a sequence of RRs in the form of IXFR (this is
> derived from BIND 9's journal implementation)
> - For database-backed data sources we'll generally store the diffs
> in a separate table of the same database as that for zones and RRs
> (but higher level APIs allow different ways)
> - To store diffs, we extend the existing ZoneUpdater class: add a new
> 'journaling' parameter to the factory interface (getUpdater()), and
> store diffs as we make updates if it's true. We'll also provide an
> additional method to ZoneUpdater so that the app can store a set of
> diffs at once.
> - To read diffs, we introduce a separate class (tentatively named
> "ZoneJournalReader") and provide a corresponding factory method.
> We create a reader object with the zone name, beginning serial
> number, and ending serial number. The reader object internally
> works as an iterator over all the diffs from begin to end, and we
> provide an interface to retrieve the "next diff" from the iterator.
>
> This is mostly derived from BIND 9's design, but adjusted to the BIND
> 10's architecture (e.g., in many cases diffs are managed in the same
> database, and the details may vary depending on the specific data
> source/database backends).
>
> Some more specific ideas:
>
> As for the diff representation, it's a list of IXFR-style difference
> sequences. Each difference sequence is a sequence of RRs:
> pre-transaction SOA, zero or more other deleted RRs, the
> post-transaction SOA, and zero or more other added RRs. Example:
>
> example.com. SOA serial=1
> a.example.com. A 192.0.2.1
> example.com. SOA serial=2
> b.example.com. A 192.0.2.2
> c.example.com. A 192.0.2.3
>
> This means a version change from serial 1 to 2 for zone example.com,
> deleting a.example.com/A and adding {b,c}.example.com/A.
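
As I read it, a flat sequence like this could be split back into
per-version deletions and additions roughly as follows. This is only a
sketch: the RR tuples (name, rrtype, rdata), with the SOA rdata reduced
to its serial, and the function name are my assumptions.

```python
def split_difference_sequences(rrs):
    """Split a flat IXFR-style RR list into per-transaction diffs.

    Each RR is a (name, rrtype, rdata) tuple; for SOA RRs the rdata is
    simplified to just the serial number.
    """
    sequences = []
    soa_count = 0
    for name, rrtype, rdata in rrs:
        if rrtype == "SOA":
            soa_count += 1
            if soa_count % 2 == 1:
                # Pre-transaction SOA: opens a new difference sequence.
                sequences.append(
                    {"from": rdata, "to": None, "deleted": [], "added": []})
            else:
                # Post-transaction SOA: switches from deletions to additions.
                sequences[-1]["to"] = rdata
        elif not sequences:
            raise ValueError("a difference sequence must start with an SOA")
        elif sequences[-1]["to"] is None:
            sequences[-1]["deleted"].append((name, rrtype, rdata))
        else:
            sequences[-1]["added"].append((name, rrtype, rdata))
    if sequences and sequences[-1]["to"] is None:
        raise ValueError("missing post-transaction SOA")
    return sequences


example = [
    ("example.com.", "SOA", 1),
    ("a.example.com.", "A", "192.0.2.1"),
    ("example.com.", "SOA", 2),
    ("b.example.com.", "A", "192.0.2.2"),
    ("c.example.com.", "A", "192.0.2.3"),
]
diffs = split_difference_sequences(example)
```

For the example, diffs holds a single entry going from serial 1 to 2,
with one deleted RR and two added RRs.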
>
> Exactly how to store the diffs in a database table can be different on
> different data source, but one example would be to use this schema:
>
> table diffs (id integer primary key, zone_id integer, version integer, operation integer, name string, rrtype string, [columns for ttl, rdata: omitted])
> Where "operation" can have either 0 (delete) or 1 (add) - this can
> also be a string, of course.
>
> Then the stored database rows corresponding to the above example would
> look like this (assume "zone_id" for example.com is 10):
> ID ZID ver op name rrtype
> 1, 10, 1, 0, "example.com.", "SOA",...
> 2, 10, 1, 0, "a.example.com.", "A", ...
> 3, 10, 2, 1, "example.com.", "SOA", ...
> 4, 10, 2, 1, "b.example.com.", "A", ...
> 5, 10, 2, 1, "c.example.com.", "A", ...
>
> To retrieve the entire diffs between given two versions (B and E)
> for a zone whose zone_id is Z, we'll execute:
>
> select * from diffs where
> zone_id = Z and
> id >= (select id from diffs where version = B and operation = 0
> order by id asc limit 1)
> and
> id <= (select id from diffs where version = E and operation = 1
> order by id desc limit 1);
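
I tried this out with Python's sqlite3 module on an in-memory database.
A sketch, with a few assumptions of mine: the ttl and rdata columns are
left out, the rows for zone_id 11 and for version 2 -> 3 are made-up
test data, and I added "zone_id = Z" to the two sub-selects as well,
since otherwise a row from another zone with the same version could be
picked as the boundary.

```python
import sqlite3

ROWS = [
    # id, zone_id, version, operation (0=delete, 1=add), name, rrtype
    (1, 10, 1, 0, "example.com.", "SOA"),
    (2, 10, 1, 0, "a.example.com.", "A"),
    (3, 10, 2, 1, "example.com.", "SOA"),
    (4, 10, 2, 1, "b.example.com.", "A"),
    (5, 10, 2, 1, "c.example.com.", "A"),
    (6, 11, 1, 0, "example.org.", "SOA"),   # another zone (made up)
    (7, 11, 2, 1, "example.org.", "SOA"),
    (8, 10, 2, 0, "example.com.", "SOA"),   # made-up 2 -> 3 transaction
    (9, 10, 2, 0, "b.example.com.", "A"),
    (10, 10, 3, 1, "example.com.", "SOA"),
]

QUERY = """
SELECT id, operation, name, rrtype FROM diffs
WHERE zone_id = :z
  AND id >= (SELECT id FROM diffs
             WHERE zone_id = :z AND version = :b AND operation = 0
             ORDER BY id ASC LIMIT 1)
  AND id <= (SELECT id FROM diffs
             WHERE zone_id = :z AND version = :e AND operation = 1
             ORDER BY id DESC LIMIT 1)
ORDER BY id
"""


def get_diffs(conn, zone_id, begin, end):
    """Return the diff rows taking zone `zone_id` from `begin` to `end`."""
    params = {"z": zone_id, "b": begin, "e": end}
    return conn.execute(QUERY, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE diffs (
    id INTEGER PRIMARY KEY, zone_id INTEGER, version INTEGER,
    operation INTEGER, name TEXT, rrtype TEXT)""")
conn.executemany("INSERT INTO diffs VALUES (?, ?, ?, ?, ?, ?)", ROWS)

for row in get_diffs(conn, 10, 1, 2):
    print(row)
```

If either boundary version is not in the table, the sub-select yields
NULL and the query returns no rows, so the caller can tell that the
requested range is not covered and fall back to AXFR.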
>
> Alternatively, we might more heavily rely on the database's own
> feature, such as introducing an update trigger so that any
> insert/delete operation on the "records" table will be logged in a
> separate table. In this initial idea I didn't choose that path
> because it seemed to be more difficult to retrieve diffs in a way we
> want (i.e., getting diffs for given two versions in a sorted order),
> and because if we allow the database to be "non captive", some
> unrelated diffs may also be logged and confuse the retrieval
> interface. But the higher level API should be flexible enough to
> allow the lowest-level representation to use this approach.
>
> In my original example of using a specialized table, the RRs might be
> stored as binary data for performance reasons. Right now I'm not sure
> if we need such optimization, though.
>
> To show the main points of the intended APIs, here's a hypothetical
> (and incomplete) python code fragment:
>
> class ZoneUpdater():
>     """We already have this class, but its constructor now has a third
>     parameter, 'journaling'. If journaling is True, every change made in
>     the updater will be recorded in persistent storage for the
>     corresponding data source. Unless commit() is performed on the updater,
>     changes to the journal are not committed either. On performing
>     commit, the journal data are committed first; the main commit
>     operation may or may not succeed.
>
>     """
>     def __init__(self, replace, journaling):
>         """To be fully flexible, we separate the interfaces for the
>         transaction of the main updates (begin_transaction) and the
>         transaction for adding diffs (begin_diff_transaction). But if
>         the actual backend uses tables of the same database, the latter
>         could actually be a no-op.
>
>         """
>         self.__accessor = ... # this is an accessor to the actual data source
>         self.__accessor.begin_transaction(replace)
>         self.__journaling = journaling
>         if self.__journaling:
>             self.__accessor.begin_diff_transaction()
>
>     def add_rrset(self, rrset):
>         """Note: the actual behavior of 'add_diff' may vary in different
>         data source backends. In an extreme case, the backend is very
>         smart and journaling may be integrated in 'add_rr'. In that case
>         'add_diff' would be a no-op.
>
>         """
>         for rr in rrset:
>             self.__accessor.add_rr(rr)
>             if self.__journaling:
>                 self.__accessor.add_diff(('add', rr))
>
>     # delete_rrset will be extended in a similar way.
>
>     def commit(self):
>         if self.__journaling:
>             self.__accessor.commit_diff()
>         self.__accessor.commit()
>
> class ZoneJournalReader:
>     def __init__(self, zone, begin_serial, end_serial):
>         self.__accessor = ... # this is an accessor to the actual data source
>         self.__begin_serial = begin_serial
>         self.__end_serial = end_serial
>         # _get_diffs will internally execute the "select" to retrieve the
>         # sequence:
>         self.__accessor._get_diffs(begin_serial, end_serial)
>
>     def get_next(self):
>         return self.__accessor.get_next_diff(self.__begin_serial,
>                                              self.__end_serial)
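
To check my understanding of the commit ordering, here is a small
runnable sketch with an in-memory stand-in for the accessor.
InMemoryAccessor is purely hypothetical, and passing the accessor into
the constructor is my simplification; the proposal obtains it
internally.

```python
class InMemoryAccessor:
    """Hypothetical stand-in for a data source accessor (not a BIND 10 class)."""

    def __init__(self):
        self.records = []          # committed zone RRs
        self.journal = []          # committed diffs, in order
        self._pending_rrs = []
        self._pending_diffs = []

    def begin_transaction(self, replace):
        # With replace=True the zone content starts from scratch.
        self._pending_rrs = [] if replace else list(self.records)

    def begin_diff_transaction(self):
        self._pending_diffs = []

    def add_rr(self, rr):
        self._pending_rrs.append(rr)

    def add_diff(self, diff):
        self._pending_diffs.append(diff)

    def commit_diff(self):
        self.journal.extend(self._pending_diffs)
        self._pending_diffs = []

    def commit(self):
        self.records = self._pending_rrs


class ZoneUpdater:
    """Same flow as the quoted fragment, with the accessor passed in."""

    def __init__(self, accessor, replace, journaling):
        self._accessor = accessor
        self._accessor.begin_transaction(replace)
        self._journaling = journaling
        if self._journaling:
            self._accessor.begin_diff_transaction()

    def add_rrset(self, rrset):
        for rr in rrset:
            self._accessor.add_rr(rr)
            if self._journaling:
                self._accessor.add_diff(("add", rr))

    def commit(self):
        # Journal first, then the main data, as in the proposal.
        if self._journaling:
            self._accessor.commit_diff()
        self._accessor.commit()


accessor = InMemoryAccessor()
updater = ZoneUpdater(accessor, replace=False, journaling=True)
updater.add_rrset([("b.example.com.", "A", "192.0.2.2"),
                   ("c.example.com.", "A", "192.0.2.3")])
journal_before_commit = list(accessor.journal)
updater.commit()
```

Nothing reaches accessor.journal or accessor.records until commit() is
called, and with journaling=False the journal stays empty.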
>
> ---
> JINMEI, Tatuya
> _______________________________________________
> bind10-dev mailing list
> bind10-dev at lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind10-dev