[bind10-dev] Datasource API

Shane Kerr shane at isc.org
Tue Jan 5 19:51:39 UTC 2010


Jelte,

On Tue, 2010-01-05 at 16:59 +0100, Jelte Jansen wrote:

> Here's a general throw at something to get us started; the example header
> is based on the current name API header files in the parkinglot branch
> (since that is where I plan on prototyping this; Jinmei can tell us how
> his current work will let us do some of this better, and the rest of us
> can remark where there is room for improvement that hasn't been noted
> yet).

Excellent!

> For starters, let me be clear on my view of what we call the data
> source: in essence I think of it as an interface that, once implemented,
> our future authoritative server module can use as its backend. This
> means that it needs to be able to provide DNS data in a form that our
> module understands, regardless of how it stores that data itself. Also,
> having some fixed way to import and update zone data would be nice
> (though IMHO this should not be mandatory for every backend, as some
> backends may have their own methods for updating themselves).

Right. We also may want to implement read-only back-ends (for example).
Programs that want to use the API to update data sources need to have a
way to check whether imports/updates are possible. (Is there a
difference? Can we reasonably envision a data source that we can import
to but not then subsequently update?)
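
Something along these lines could work; the class and method names here
are invented, not from Jelte's header:

class DataSource {
public:
    virtual ~DataSource() {}
    // a purely read-only back-end leaves both of these returning false
    virtual bool canImport() const { return false; }
    virtual bool canUpdate() const { return false; }
};

A caller that wants to apply an update would then check canUpdate() first
and fail cleanly (or fall back to another data source) instead of finding
out halfway through.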

> So at the minimum the class will need virtual functions for getting
> data, adding data, and removing data (sound familiar? heh). There will
> probably be a few different kinds of data in that basic set of methods.
> These low-level methods need to be defined by the specific data source
> implementation classes.

Yes.

> On top of that, the plan is to make more advanced methods that use those
> low-level ones to actually get from a query to an answer, and to do the
> more magical stuff that happens when updates occur. The basic idea is
> that if you write a new back-end, you start out by implementing the
> low-level ones, which already gives you a working data source, albeit
> one that may not be very efficient. You then have the option to also
> redefine the high-level functions to make use of any shortcuts your
> data model allows.

Yes.
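
To make that layering concrete, here is a rough sketch. The types come
from the header below, but the accessors on the RRset and the exact
signatures are my assumptions, not the real libdns API:

#include "name.h"
#include "rrset.h"
#include "message.h"   // assumed to exist alongside name.h/rrset.h

namespace isc {
namespace dns {

// sketch only: getData() has a default body written purely in terms of
// the pure-virtual getRRset(), so implementing getRRset() alone already
// gives a working (if slow) data source; a back-end can later override
// getData() with something smarter
class DataSource {
public:
    enum result { success, not_implemented, error, zone_not_found,
                  name_not_found };
    virtual ~DataSource() {}

    // generic 'high-level' lookup, overridable
    virtual result getData(const RRsetPtr query, Message& answer) {
        RRsetPtr target;
        result r = getRRset(target, query->getName(),
                            query->getClass(), query->getType());
        if (r == success) {
            // append 'target' to the answer section of 'answer'
            // (the exact Message call is left out of this sketch)
        }
        return (r);
    }

    // mandatory 'low-level' primitive
    virtual result getRRset(RRsetPtr& target, const Name name,
                            const RRClass rrclass, const RRType type) = 0;
};

}   // namespace dns
}   // namespace isc

The point is just the shape: the high-level call never needs to know how
the back-end stores its data.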

> So here's a first stab at a base class that would provide at least some
> of what we need; there's probably a lot to add but let's hope this gets
> things started.
> 
> Based on this, I think I could get a DataSourceParkingLot done pretty
> quickly, but if anyone thinks this is completely not the way to go,
> please let me know before tomorrow ;)
> 
> Jelte
> 
> PS: we'll probably need a more advanced getRRset(), and I have not
> thought about how we could do history for IXFR at all yet.

We spent quite a few cycles talking about this at the first BIND 10
face-to-face meeting. Fundamentally we need something like a transaction
for any data source that can be updated.

An API could look a lot like BIND 9's, which basically creates an update
context and locks the database, AIUI. This can be BEGIN/COMMIT in an SQL
database... although we probably want to ensure we have the correct
transaction isolation level (serializable is best, but possibly other
levels are acceptable and give much improved performance... SQLite does
serializable by default; PostgreSQL and MySQL need a bit of convincing):

http://www.sqlite.org/sharedcache.html
http://www.postgresql.org/docs/8.4/interactive/transaction-iso.html
http://dev.mysql.com/doc/refman/5.5/en/set-transaction.html
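
For an SQL back-end, the transaction object could be little more than a
wrapper around that. A rough sketch against SQLite (the sqlite3_* calls
are real; the class itself is invented):

#include <sqlite3.h>
#include <stdexcept>
#include <string>

// invented class: begins a transaction on construction, rolls back in
// the destructor unless commit() was called, so an exception anywhere
// in the update path cannot leave the database half-modified
class SqliteTransaction {
public:
    explicit SqliteTransaction(sqlite3* db) : db_(db), open_(true) {
        exec("BEGIN IMMEDIATE");   // take the write lock up front
    }
    ~SqliteTransaction() {
        if (open_) {
            sqlite3_exec(db_, "ROLLBACK", NULL, NULL, NULL);
        }
    }
    void commit()   { exec("COMMIT");   open_ = false; }
    void rollback() { exec("ROLLBACK"); open_ = false; }
private:
    void exec(const char* sql) {
        char* errmsg = NULL;
        if (sqlite3_exec(db_, sql, NULL, NULL, &errmsg) != SQLITE_OK) {
            std::string msg(errmsg ? errmsg : "unknown sqlite error");
            sqlite3_free(errmsg);
            throw std::runtime_error(msg);
        }
    }
    sqlite3* db_;
    bool open_;
};

For PostgreSQL or MySQL the constructor would additionally issue
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE (or set it per session),
per the links above.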

We can make serials either implicit or explicit in this. I think keeping
them implicit makes sense as it allows concurrent processing, although
we may then need to add some special handling if the SOA itself is
modified.
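
By "implicit" I mean roughly the following (sketch only, names invented):
the transaction remembers whether the caller touched the SOA itself, and
if not, bumps the serial at commit time.

#include <stdint.h>

// invented sketch of implicit serial handling inside a transaction
class ZoneTransaction {
public:
    ZoneTransaction() :
        current_serial_(0),          // would really be read from the SOA
        soa_explicitly_set_(false), dirty_(false)
    {}

    void noteRRChange(bool is_soa) {
        dirty_ = true;
        if (is_soa) {
            soa_explicitly_set_ = true;   // caller manages the serial
        }
    }

    void commit() {
        if (dirty_ && !soa_explicitly_set_) {
            // serial arithmetic is modulo 2^32 (RFC 1982), so a plain
            // increment on a uint32_t does the right thing
            ++current_serial_;
            // ... rewrite the stored SOA with the new serial ...
        }
        // ... then commit the underlying store ...
    }

private:
    uint32_t current_serial_;
    bool soa_explicitly_set_;
    bool dirty_;
};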


An API could also perhaps look a bit like a DDNS update, where you
collect all of your operations and then submit them at once. I'm not sure
how practical this is, because we would need to collect a set of reads
together without knowing exactly what they are (I think). So unless
someone wants to explore this, I think we need to go with the transaction
model for updates.

A few comments inline below...

> #ifndef __DATA_SOURCE_H
> #define __DATA_SOURCE_H
> 
> #include "name.h"
> #include "rrset.h"
> 
> namespace isc {
> namespace dns {
> 
> // Base class for a DNS Data Source
> class DataSource {
> public:
>     enum result { success, not_implemented, error, zone_not_found,
>                   name_not_found };
> 
>     DataSource() {};
>     virtual ~DataSource() {};
> 
>     //
>     // 'high-level' methods; these may or may not be overridden.
>     // Depending on the data source, the default implementation
>     // may not be the most efficient, but should at least provide
>     // the needed functionality
>     //

We need some sort of transaction, as mentioned above. We may want to
always have this for every operation, although conceivably some
shortcuts are nice. So, basically a factory method as part of the
DataSource:

    Transaction beginTrans();

In this case, we move almost all operations out of the DataSource class
and into the Transaction class...

class Transaction {
public:
    // these raise exceptions presumably
    void commit();
    void rollback();

    // ... plus the getData()/addRR()/delRR()/etc. methods that move over
    // from DataSource ...
};
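
Usage would then look roughly like this (assuming the beginTrans() sketch
above; in practice it would probably return a pointer or smart pointer so
that back-end-specific transaction classes can be used):

void doUpdate(DataSource& ds) {
    Transaction trans = ds.beginTrans();
    try {
        // trans.addRR(...);  trans.delRR(...);  etc.
        trans.commit();
    } catch (...) {
        trans.rollback();   // leave the data source untouched on failure
        throw;
    }
}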

>     // fill in separate lists or simply fill in an answer Message object?
>     result getData(const RRsetPtr query, Message& answer);
> 
>     // how to provide highlevel update data?
>     //result handleUpdate()
>     //
>     // mandatory 'low-level' methods; an implementation must override these
>     //
>     virtual result getRRset(RRsetPtr target, const Name name,
>                             const RRClass clas, const RRType type) = 0;
                                            (typo: should be "class", though
                                            that's a reserved word in C++, so
                                            perhaps "rrclass")
> 
>     //
>     // optional 'low-level' methods; an implementation may override these,
>     // by default they return not_implemented
>     //
>     virtual void init() {};
>     virtual void close() {};

The two above have to stay in the DataSource class, of course. :)

>     virtual result addRR(Name name, int clas, int type,
>                          int ttl, Rdata::RdataPtr data)
>                         { return not_implemented; };
>     //how to specify what exact rr to delete? just pass an rrsetptr?
>     //virtual result delRR() { return not_implemented; };
>     // on name/class/type again? or use an already constructed object?
>     virtual result getRRSigs(RRsetPtr target, const RRsetPtr rrset)
>                             { return not_implemented; };
>     virtual result getNSECs(RRsetPtr target, const RRsetPtr rrset)
>                             { return not_implemented; };
> 
>     // check if the zone exists, and if so, return something that could
>     // be used as a pointer for the rest of these functions?
>     // do we need this? do we want this?
>     //virtual int getZone(void* zone, isc::dns::Name name);
> 
> };

We also need an iterator of some kind, so we can figure out the names in
the data source, right?
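
Something like this, perhaps (sketch only, names invented; Name is the
class from name.h):

#include "name.h"

// invented sketch: an abstract cursor a back-end implements so callers
// can walk every owner name it contains (AXFR-out, zone checking, ...)
class DataSourceIterator {
public:
    virtual ~DataSourceIterator() {}

    // fills in 'name' with the next owner name and returns true, or
    // returns false once the data source is exhausted
    virtual bool getNextName(isc::dns::Name& name) = 0;
};

// ... and DataSource itself would grow a factory method such as:
//     virtual DataSourceIterator* createIterator() = 0;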

I am assuming fully-qualified names throughout the code - we don't have
add/remove zone operations; to "add" or "remove" a zone one would either
add records of the appropriate names, or remove them all.

I specified a beginTrans() without a zone... some data sources may need
locking, and want to do this per-zone. That can be implemented in a
DataSource-specific transaction class.

--
Shane



