[bind10-dev] revised/refactored data source design proposal

Fri Jun 3 18:36:00 UTC 2011

At Fri, 3 Jun 2011 16:22:58 +0200,
Michal 'vorner' Vaner <michal.vaner at nic.cz> wrote:

> > Good question.  I noticed the problem of having redundant copies of
> > the in-memory case.  So far, I've not yet given much thought about in
> > this area, but some random thoughts include:
[...]
> Well, if we have the „parasitic“ design (in-memory lives on top of
> other data source), then xfrin needs only the underlying data source
> and, expecting the xfrout will happen less often and will go either
> trough the diffs in-memory doesn't have loaded at all, or trough the
> whole thing (and dumping everything is OK for databases, they are
> quite fast at that, due to buffering), I believe we don't really
> need it loaded anywhere else except auth.

I don't parse the part after "will go either t(h)rough...".

But I see that if we store everything in a real database system and
the in-memory data source client only loads the data from it at
run-time for faster query processing, then xfrin probably won't need
an in-memory copy of the data.

BTW, are you saying this is what we decided at the last f2f?  If so,
maybe I really didn't understand the seeming consensus correctly (I'm
not necessarily opposed to the idea itself, although it seems more
ambitious and will have more technical challenges).

> > Actually, I didn't consider transaction for reading at all in that
> > proposal, rather than intentionally excluded it.
[...]
> Well, I meant for the whole query. I'm not sure how big
> inconsistencies could happen and how big problem they could be in
> practice, maybe it's not worth the effort, but from theoretical
> point of view, there's the possibility and computers like to bite
> whenever there's the possibility ;-).

I don't oppose efforts to ensure full consistency.  If this is
something users really care about but only BIND 10 can provide, that
would be very nice indeed.  My only point is that we should be careful
not to over-engineer something that is not needed by actual users and
that only satisfies engineers' egos.

> Anyway, I got an idea, which might or might not work. You say
> there's a client for each thread. What about having a pool of these
> (assuming that the DB connections really don't care if they are used
> from multiple threads as long as they are not used from multiple
> threads at once, moving from one to another should be OK I
> guess).

If I remember correctly, SQLite3 doesn't even allow that (MySQL allows
it with the condition you mentioned), but as I will note below I think
we can find a way to work around this if it's really needed.
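
One place where this kind of restriction is visible is Python's
standard sqlite3 module, which by default refuses to use a connection
from a thread other than the one that created it.  The following is
only a small sketch of that behavior, not BIND 10 code; the helper
name use_from_other_thread is made up for illustration.

```python
import sqlite3
import threading


def use_from_other_thread(conn):
    """Run a trivial query on conn from a new thread.

    Returns the sqlite3.ProgrammingError raised in that thread,
    or None if the query succeeded.
    """
    result = {"error": None}

    def worker():
        try:
            conn.execute("SELECT 1").fetchone()
        except sqlite3.ProgrammingError as exc:
            result["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["error"]


# Default: the connection is tied to the thread that created it.
pinned = sqlite3.connect(":memory:")

# Opting out of the check: the caller must then guarantee that the
# connection is never used by two threads at the same time.
movable = sqlite3.connect(":memory:", check_same_thread=False)

if __name__ == "__main__":
    print("pinned:", use_from_other_thread(pinned))
    print("movable:", use_from_other_thread(movable))
```

With check_same_thread=False the "one thread at a time" condition is
no longer enforced by the module, so a pool would have to provide that
guarantee itself.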

> When one of them would be taken out of the pool, the
> transaction could implicitly begin, after the operation terminates
> (and either the shared pointer count is reduced to zero or when it
> returns to the pool by some other mean), the transaction would
> terminate. If this would be for read-only transaction anyway, it
> wouldn't matter much which way. Any DB backend would be free to
> implement it or not.

It seems that it's effectively equivalent to b10-auth (or anything
that needs to process normal queries) starting a transaction at the
beginning of query processing and finishing (committing) it when it
has all the RRsets for the response.  The only difference would be
whether it's encapsulated in fetching a client or whether it's
explicit.  We can also encapsulate this logic in the ZoneHandle class.
Whether it's implicit or explicit, this way we can use a single
transaction for all RRsets needed to process the entire query.
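
To make the "single transaction per query" idea concrete, here is a
minimal sketch using Python's sqlite3 module.  Everything in it is an
assumption for illustration: the records(name, rrtype, rdata) schema
and the names read_transaction, lookup, answer_query and make_db are
hypothetical, and none of this is the actual data source API.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def read_transaction(conn):
    """Wrap the enclosed lookups in one explicit transaction."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Nothing was written, so COMMIT just ends the transaction.
        conn.execute("COMMIT")


def lookup(conn, name, rrtype):
    """Return the sorted RDATA strings of one RRset (hypothetical schema)."""
    cur = conn.execute(
        "SELECT rdata FROM records WHERE name = ? AND rrtype = ? "
        "ORDER BY rdata",
        (name, rrtype),
    )
    return [row[0] for row in cur]


def answer_query(conn, qname, zone):
    """Fetch every RRset needed for one response in one transaction."""
    with read_transaction(conn):
        return {
            "answer": lookup(conn, qname, "A"),
            "authority": lookup(conn, zone, "NS"),
        }


def make_db():
    """Build a tiny in-memory zone for the example."""
    # isolation_level=None: the module never issues BEGIN on its own.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE records (name TEXT, rrtype TEXT, rdata TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?)",
        [
            ("example.", "NS", "ns1.example."),
            ("example.", "NS", "ns2.example."),
            ("www.example.", "A", "192.0.2.1"),
        ],
    )
    return conn


if __name__ == "__main__":
    print(answer_query(make_db(), "www.example.", "example."))
```

Here the transaction is implicit from the caller's point of view (it
lives inside answer_query), which corresponds to the "encapsulated"
variant; the explicit variant would simply expose read_transaction to
the application.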

One of my concerns in this case is that if we pass a connection state
holding an open transaction to an application, it will open up a wider
possibility of starvation due to a bug on the application side.
Another concern is that when we use the entire "client" for the
transaction, it may cause confusing results, such as the application
trying to start a writable transaction using the same client and
blocking due to its own read transaction.

But, again, my real question is whether this feature (ensuring a
single transaction and 100% consistency) is really needed by users.
If it is, I agree it's worth pursuing, and the benefit would probably
outweigh the above concerns.

> > Another note in case it was not clear from the description: the
> > ZoneIterator would require a single DB transaction (essentially it's
> > "select * from records") and implicitly uses a kind of transaction
> > (this means for a very big zone AXFR-out would probably ineffective,
> > but that's a subject of another discussion).
> 
> I don't know, they are probably effective enough in the big DBs,
> they just start sending you sending row after row and you read them
> one by one. I can't imagine anything more effective in the DB world,
> but I'm not expert there at all.

I'm not sure we're talking about the same thing... is your comment
about the "another discussion"?  If so, yes, "they just start sending
you row after row and you read them one by one".  The problem is that
it may take a long time, during which the database is read-locked and
cannot be updated.
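
The read-lock concern can be reproduced with SQLite in its default
rollback-journal mode.  The sketch below is only an illustration under
that assumption (other databases, or SQLite in WAL mode, behave
differently); the schema and the names try_insert and demo are made up
for the example.

```python
import os
import sqlite3
import tempfile


def try_insert(conn, name):
    """Try to add one record; return False if the database is locked."""
    try:
        conn.execute(
            "INSERT INTO records VALUES (?, 'A', '192.0.2.1')", (name,)
        )
        return True
    except sqlite3.OperationalError:
        return False


def demo():
    """Return (write succeeded during the read, write succeeded after it)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "zone.sqlite3")
        # isolation_level=None: transactions are controlled explicitly.
        reader = sqlite3.connect(path, isolation_level=None)
        # timeout=0: report a locked database at once instead of waiting.
        writer = sqlite3.connect(path, isolation_level=None, timeout=0)

        reader.execute(
            "CREATE TABLE records (name TEXT, rrtype TEXT, rdata TEXT)"
        )
        reader.executemany(
            "INSERT INTO records VALUES (?, 'A', '192.0.2.1')",
            [("a.example.",), ("b.example.",)],
        )

        # A long-running "iterate over the whole zone" read.
        reader.execute("BEGIN")
        rows = reader.execute("SELECT name FROM records")
        rows.fetchone()  # the reader now holds a shared lock

        during = try_insert(writer, "c.example.")

        rows.fetchall()  # finish reading the zone
        reader.execute("COMMIT")  # the shared lock is released here

        after = try_insert(writer, "c.example.")

        reader.close()
        writer.close()
        return during, after


if __name__ == "__main__":
    print(demo())
```

The longer the iteration between BEGIN and COMMIT takes, the longer
any update has to wait or fail.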

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.