[bind10-dev] revised/refactored data source design proposal
Michal 'vorner' Vaner
michal.vaner at nic.cz
Fri Jun 3 14:22:58 UTC 2011
Hello
On Thu, Jun 02, 2011 at 11:45:11PM -0700, JINMEI Tatuya / 神明達哉 wrote:
> The length of names was my own concern, too. On one hand, I agree
> with using shorter names, especially because I don't like having too
> long lines (longer class/variable names are major contributors to
> longer lines). On the other hand, I also believe awkward abbreviation
> is another source of reducing readability. This is why some of the
> proposed classes are named with "DataSource", instead of "DataSrc".
> So, the current result is an incomplete attempt of trying to find a
> good balance. I'm quite open to suggestions on this point.
Well, I believe this is the reason why we have namespaces, so we could have
things like isc::datasrc::Client and isc::datasrc::database::Creator. It's more
readable, and inside datasrc we can just refer to it as database::Creator.
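Just to make the layout concrete, something like this (a sketch; only Client and Creator come from the discussion above, the name() methods and the helper are made up):

```cpp
// Sketch of the proposed namespace layout.  Only Client and
// database::Creator are names from the proposal; everything else is
// made up for illustration.
#include <cassert>
#include <string>

namespace isc {
namespace datasrc {

class Client {
public:
    std::string name() const { return "Client"; }
};

namespace database {

class Creator {
public:
    std::string name() const { return "Creator"; }
};

} // namespace database

// Inside isc::datasrc, the short database::Creator is enough; no long
// fully-qualified prefix needed here.
inline std::string createName() {
    database::Creator creator;
    return creator.name();
}

} // namespace datasrc
} // namespace isc
```

Outside the library you'd write the full isc::datasrc::database::Creator, which stays readable without squeezing the class names themselves.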
> Good question. I noticed the problem of having redundant copies of
> the in-memory case. So far, I've not yet given much thought in
> this area, but some random thoughts include:
> - we are still not sure how to benefit from multiple cores in
> b10-auth; whether to use threads or sub processes, etc.
> - if we decide to use threads, we'll probably still use separate
> Client object for separate threads, but employ some techniques to
> share the underlying in-memory data.
> - for longer terms, I hope we can use more sophisticated approach such
> as a single memory image on the system shared by multiple BIND 10
> processes (not only by auth, but also xfrin/out, etc), using
> techniques such as shared memory or mmap.
Well, if we have the "parasitic" design (in-memory lives on top of another data
source), then xfrin needs only the underlying data source. As for xfrout, I
expect it to happen less often and to go either through diffs the in-memory copy
doesn't have loaded at all, or through the whole zone (and dumping everything is
OK for databases; they are quite fast at that, thanks to buffering). So I
believe we don't really need it loaded anywhere except auth.
> Actually, I didn't consider transaction for reading at all in that
> proposal, rather than intentionally excluded it. As you said, it's
> not difficult to use a single transaction for a single call to
> ZoneHandle::find(), and I don't have an objection to that. But if you
> also wanted to have a transaction for the entire single DNS query
> (which possibly consists of multiple calls to the find() method, one
> for the answer, another for the authority, other for glue, etc), it
> may become trickier.
Well, I meant for the whole query. I'm not sure how big the inconsistencies
could get and how much of a problem they would be in practice; maybe it's not
worth the effort. But from a theoretical point of view, the possibility is
there, and computers like to bite whenever there's a possibility ;-).
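For the whole-query case, one way to picture it is an RAII guard that holds a single read transaction across all the find() calls a query needs (answer, authority, glue), so every section sees the same snapshot. A sketch, with a made-up Backend standing in for whatever the real DB interface ends up being:

```cpp
// Sketch: one read transaction spanning the several lookups that make
// up a single DNS query.  Backend and its methods are hypothetical
// stand-ins, not the proposed API.
#include <cassert>
#include <string>

struct Backend {
    int begun = 0;
    int committed = 0;
    void begin()  { ++begun; }     // start a read transaction
    void commit() { ++committed; } // end it
    std::string find(const std::string& qname) {
        return "data-for-" + qname;
    }
};

// RAII guard: the transaction lives exactly as long as the query does,
// so answer, authority and glue lookups share one consistent snapshot.
class QueryTransaction {
public:
    explicit QueryTransaction(Backend& backend) : backend_(backend) {
        backend_.begin();
    }
    ~QueryTransaction() {
        backend_.commit();
    }
    std::string find(const std::string& qname) {
        return backend_.find(qname);
    }
private:
    Backend& backend_;
};
```

Whether that's worth the bookkeeping is exactly the open question above, but the mechanism itself is cheap.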
> Also, for the entire DNS query, there's another possibility of having
> inconsistency: the answer may come from the hot spot cache (which
> could be older) and data for other sections may not be cached and
> require DB transaction. What should we do in this case?
Well, I didn't care much about the hot-spot cache. But generally, I believe we
should either provide everything from there or nothing, and the DB should
signal when the cache needs to be invalidated (speaking of the optimal
scenario, of course).
Anyway, I got an idea, which may or may not work. You say there's a client for
each thread. What about having a pool of these (assuming the DB connections
really don't care whether they are used from multiple threads, as long as they
are not used from multiple threads at once; moving one from thread to thread
should be OK, I guess)? When one of them is taken out of the pool, a
transaction could implicitly begin; when the operation terminates (either
because the shared pointer count drops to zero or because it returns to the
pool by some other means), the transaction would terminate. If these were
read-only transactions anyway, it wouldn't matter much which way. Any DB
backend would be free to implement it or not.
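To show what I mean, a rough sketch of the pool: acquiring a connection implicitly begins a read-only transaction, and a custom shared_ptr deleter ends the transaction and returns the connection when the last reference goes away. The Connection type and its methods are invented for the sketch; a real backend would map them to its own API (and a real pool would block or grow instead of asserting when empty):

```cpp
// Sketch of the pool idea: take a connection out -> read-only
// transaction begins; last shared_ptr reference dropped -> transaction
// ends and the connection goes back.  Connection is a hypothetical
// stand-in for a real DB handle.
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct Connection {
    bool in_transaction = false;
    void beginReadOnly() { in_transaction = true; }
    void commit()        { in_transaction = false; }
};

class ConnectionPool {
public:
    explicit ConnectionPool(size_t n) {
        for (size_t i = 0; i < n; ++i) {
            free_.push_back(std::make_unique<Connection>());
        }
    }
    // Hand out a connection; the custom deleter ends the transaction
    // and returns it to the pool instead of destroying it.  (The pool
    // must outlive every handed-out pointer in this sketch.)
    std::shared_ptr<Connection> acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(!free_.empty()); // a real pool would wait or grow here
        std::unique_ptr<Connection> conn = std::move(free_.back());
        free_.pop_back();
        conn->beginReadOnly();  // transaction starts implicitly
        return std::shared_ptr<Connection>(
            conn.release(), [this](Connection* c) {
                c->commit();    // transaction ends implicitly
                std::lock_guard<std::mutex> lock(mutex_);
                free_.emplace_back(c);
            });
    }
    size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }
private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<Connection>> free_;
};
```

The nice part is that the caller never has to remember to end the transaction; dropping the last reference does it.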
> Another note in case it was not clear from the description: the
> ZoneIterator would require a single DB transaction (essentially it's
> "select * from records") and implicitly uses a kind of transaction
> (this means for a very big zone AXFR-out would probably be ineffective,
> but that's a subject of another discussion).
I don't know; they are probably efficient enough in the big DBs. They just
start sending you row after row and you read them one by one. I can't imagine
anything more efficient in the DB world, but I'm no expert there at all.
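That pull-style reading is essentially what the ZoneIterator would wrap: one "select * from records" query, with the caller fetching one row at a time instead of materializing the whole zone. A sketch, with a mock cursor in place of a real DB result set (all names here are invented, not the proposed API):

```cpp
// Sketch of streaming rows through a ZoneIterator-like interface.
// MockResultSet stands in for a real DB cursor over
// "select * from records"; Row and getNext() are illustrative only.
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Row {
    std::string name;
    std::string type;
    std::string rdata;
};

class MockResultSet {
public:
    explicit MockResultSet(std::vector<Row> rows) : rows_(std::move(rows)) {}
    // One row per call, like a DB fetch-row; empty when exhausted.
    std::optional<Row> fetch() {
        if (pos_ >= rows_.size()) {
            return std::nullopt;
        }
        return rows_[pos_++];
    }
private:
    std::vector<Row> rows_;
    size_t pos_ = 0;
};

// Pull-style iterator: AXFR-out could read the zone row by row with
// constant memory, keeping the single underlying query open throughout.
class ZoneIterator {
public:
    explicit ZoneIterator(MockResultSet rs) : rs_(std::move(rs)) {}
    std::optional<Row> getNext() { return rs_.fetch(); }
private:
    MockResultSet rs_;
};
```

Memory use stays constant regardless of zone size, which is why I'd expect the DBs to cope fine with big zones.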
> - xfrin has its own DataSourceClient for the in-memory data source.
> When it performs IXFR, it stores the diff to the separate DB
> storage, and also updates the in-memory image (the latter would be
> necessary to eventually dump the latest zone to a separate file).
I always thought of the XFRs as not having the in-memory copy loaded at all,
working on top of the backend DB; only auth would do some kind of atomic
reload/update, so I didn't think about this much. But if we can atomically
reload from the DB, we can atomically reload from a notification, and the XFRs
can probably afford to lock their copy completely even if we don't come up with
something more clever.
With regards
--
vvvvv
oOOOOO< Death to the bugs!
Michal 'vorner' Vaner