BIND 10 #2435: implement datasrc version of RRsetCollection

Thu Jan 10 22:04:26 UTC 2013

#2435: implement datasrc version of RRsetCollection
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  jinmei
            Priority:  medium        |                       Status:
           Component:  data source   |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130122
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  4             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  loadzone-ng
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by jinmei):

 '''design issues'''

 I've thought more about the tradeoffs between having the updater
 create a collector (what the latest version of the branch does) and
 constructing a collector using an updater (what the previous version
 does).

 The former has some negative points.  First, I'd generally like to
 avoid increasing the responsibility of a class (thus making it more
 monolithic) simply because we see it convenient because it will make
 the class less maintainable.  And, in this particular, I'd also be
 concerned that the updater will have two similar interfaces:
 `getFinder()` and `getRRsetCollection()`.  Such overlap seems to
 suggest some clarification is needed.

 On the other hand, the main advantage of this approach is that it
 allows each specific datasrc implementation to customize internal
 details of the collection.  When I first mentioned this approach, I
 thought it was more advantageous when supporting zone iteration, but
 on second thought I'm not so sure.  It may suffice to make it possible
 to create a zone iterator from an updater.  It would be cleaner in
 terms of the overlap issues explained above.

 There are, however, still some tricky relationships between a
 particular (DB-based in particular) data source implementation and the
 updater's `RRsetCollection`.  I suspect we won't be able to run
 multiple iterators (of a collection) at the same time, at least with
 SQLite3, and probably with other types of DB.  It'd also be impossible
 to make further updates while an iterator is running.  So I guess we
 need to impose some usage restriction, preferably as part of the
 behavior, not only by documentation.  And I guess it's easier to do
 such control if a specific data source implementation can customize
 the collection details.

 At this moment, I'm still a bit inclined to allow the per-datasrc
 customization, but I'm not very confident about it.

 I don't know how you reached the conclusion of adopting the current
 design as you didn't say anything about it in your comment, but these
 design considerations and why we finally ended up the current
 design/restrictions should be documented in detail.

 '''The derived RRsetCollection class implementation'''

 At least for now, it's not specific to the DB-based data source, and,
 would work for an in-memory updater if/when we implement it, too.  It
 may even be the case for a future version with the iteration support.
 So I'd introduce an intermediate base collection class that works as
 the default for all data source implementations.

 Also, as partly discussed above, this `RRsetCollection` will have non
 trivial restrictions, not only that it shouldn't be used after the
 originating updater is destroyed:

 - Strange things will happen if more changes are made after the
   collection is constructed, especially about the iterator.  One way
   to prevent it is to forbid further updates once `RRsetCollection` is
   created.  At least for the purpose of post-load check, this
   restriction is acceptable.
 - I suspect it won't work after commit, even if the updater is still
   alive.
 - use of multiple instances of collection may have some restrictions.
   for example, iterator can work in only one of them in the case of
   DB-based data sources; on the other hand, in-memory version probably
   doesn't have this restriction.

 I may be wrong about some of these, so please first examine these
 points carefully.  Then at least document these.  And, preferably,
 (after writing tests) the restrictions are enforced in the
 implementation (e.g, by throwing an exception in the case of invalid
 usage case).

 And, now, revisiting the point I raised in
 http://bind10.isc.org/ticket/2435#comment:12

 I guess the reason for returning a shared pointer is to allow creating
 multiple `RRsetCollection` instances.  If that's the reason, the
 question is whether we want to allow it and/or it makes sense.  As
 discussed above, it's mostly ineffective to create multiple collection
 instances for DB-based data sources because they won't be usable at
 the same time (if we assume the general case where iterators might be
 used).  On the other hand, in-memory version could be a bit more
 flexible, and there may be valid cases to create and use multiple
 collection instances at the same time.

 So we have several choices.  One simple way is to restrict the
 creation of collection only once per updater.  In that case we
 shouldn't need to return a shared pointer.  Another possibility is to
 allow specific derived classes to be flexible when they can, so the
 application can use the extended feature if it instantiates that
 particular derived version of the updater and collection.  In either
 case, the base class interface needs to have the most strict
 restrictions, though.  And all these things should be documented.

 '''database_unittest.cc'''

 - about the TODO: I think in this case it's okay.  But this affects
   the characteristic of the datasrc ('s updater) version of
   `RRsetCollection`.  That is, this collection consists of RRs of the
   corresponding zones, including "glue" RRs below a delegation zone
   cut, but not external ones like those subject to DNAME substitution
   or complete outside of the zone's name space.  So, specifically, not
   all records added through the updater may necessarily be found by
   the collection's find().  That should be documented.
 {{{#!cpp
     // TODO: "below.dname.example.org." with type A does not return the
     // record (see top of file). It needs to be checked if this is what
     // we want.
     rrset = this->collection->find(Name("below.dname.example.org"),
                                    this->qclass_, RRType::A());
     // Is this correct behavior?
     EXPECT_FALSE(rrset);
 }}}

 '''rrset_collection_base.h'''

 - The `FindError` class: I've seen mixed styles for exception classes
   mainly used with a particular class (like this): define exception
   class as the main class and define the exception outside of it.
   Previously I didn't have a strong opinion either way except that the
   style should be unified, but I'm now more inclined to the latter
   style.  A subclass like this knows too much internal detail of the
   enclosing class; it's effectively defined as a friend of the
   encloser and has all the power of the friend, referring to or
   modifying its private variables, etc.  Obviously that's too much for
   exception.  All we need is to clarify the context relationship
   between these two, and the usual tool for it is namespaces.
   (Unfortunately) we normally do not use a separate namespace per
   class basis, so we cannot take that approach, but I don't think this
   can be used to allow such intrusion of class integrity.  So, in
   conclusion, I suggest we define this exception outside of
   `RRsetCollectionBase` and name it something like
   `RRsetCollectionError`.  (We should also discuss it and eventual
   unification of the style, but that's beyond the scope of this task).

 - This doesn't seem to be sufficient.  What happens if these types is
   given?  Why can't we handle them?  What will happen in future
   versions?
 {{{#!cpp
     /// This method's implementations currently are not specified to
     /// handle \c RRTypes such as RRSIG, NSEC3, ANY, or AXFR.
 }}}

 - This needs documentation.
 {{{#!cpp
 typedef boost::shared_ptr<RRsetCollectionBase> RRsetCollectionPtr;
 }}}

-- 
Ticket URL: <https://bind10.isc.org/ticket/2435#comment:16>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development