BIND 10 #1183: data source refactor refactor

Fri Aug 19 14:20:45 UTC 2011

#1183: data source refactor refactor
-------------------------------------+-------------------------------------
                   Reporter:  jelte  |                 Owner:  jinmei
                       Type:  task   |                Status:  reviewing
                   Priority:  major  |             Milestone:
                  Component:  data   |  Sprint-20110830
  source                             |            Resolution:
                   Keywords:         |             Sensitive:  0
            Defect Severity:  N/A    |           Sub-Project:  DNS
Feature Depending on Ticket:         |  Estimated Difficulty:  0
        Add Hours to Ticket:  0      |           Total Hours:  0
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by jelte):

 * owner:  jelte => jinmei

Comment:

 Replying to [comment:4 jinmei]:
 > First, overall I like the new version.
 >

 yay :)

 > Second, (as usual) I made some minor fixes/cleanups directly in the
 branch
 > (pushed).
 >

 thanks

 > '''database.h'''
 >
 > - This is not really about this branch, but is this note for getNext()
 true?
 > {{{
 >          * \note The order of RRs is not strictly set, but the RRs for
 single
 >          * RRset must not be interleaved with any other RRs (eg. RRsets
 must be
 >          * "together").
 > }}}
 >   At least the MockAccessor for the test doesn't seem to conform this
 >   requirement, but the test still passes.  And, in fact, we cannot
 >   always assume this especially if we allow admins to update the DB
 >   contents directly via DBMS.
 >   On looking into the implementation further, we don't seem to need
 >   this requirement: for the case of find(), addOrCreate() deals with
 >   interleaved data.  In case of full iteration, I suspect we shouldn't
 >   care: xfrout, the would-be primary user of this interface, will
 >   simply feed the result to the xfrout stream whether or not the data
 >   is interleaved.
 >

 Good points. For either use of it we have it does indeed not really
 matter. It
 should however not matter whether or not the DBMS is modified externally;
 it is
 a property that the implementation should (or should not have to) provide.
 So
 since find() actually doesn't care about this property, the only real
 reason to
 leave it in would be so that we can ensure that the 'full iterator'
 returned by
 getAllRecords() always returns complete RRsets. If we only intend to use
 this
 for xfrout (or similar functionality) we don't need it. I have no strong
 opinion, should I remove the requirement?

 > - Again, not really for this branch, but based on the "keep it dumb"
 >   principle, the 'name' argument for getRecords() should be const
 >   std::string&, not Name.
 >

 oh yeah, changed.

 > - Once again not really for this branch, but as commented the
 >   static_cast<void> trick seems to be unnecessarily dirty.  I guess we
 >   can safely move the definition to .cc where we can omit the
 >   parameters, and if it's correct, I'd suggest that.  Further, if
 >   getRecords() doesn't even work this accessor is mostly useless.
 >   Personally I'd rather simply make it pure virtual (without
 >   definition) and force the implementor to support it (or otherwise to
 >   give up supporting this type of database).  I actually also suspect
 >   the same argument applies to getAllRecords(), but that may be a
 >   different topic.

 Actually, come to think of it, I don't think we need the default
 implementation
 either; if an implementor of a new database backend wants some code to run
 before all functions are done, it can also just define the methods and
 raise
 NotImplemented there (or maybe even just DataSourceError).

 Made them pure virtuals. Perhaps the NopAccessor class is now mostly
 useless,
 though it could also serve as a nice code example of a featureless
 implementation :)

 >
 > - I'd avoid hardcoding magic numbers like "all 5 elements" or "the
 >   fifth column" (it's susceptible to future changes - consider we
 >   deprecate SIGTYPE_COLUMN in a future version, for example).  Instead
 >   I'd
 >   say, e.g.:
 > {{{
 >          * getAllRecords(), all COLUMN_COUNT elements of the array are
 overwritten.
 >          * For a 'name iterator', created with getRecords(), the
 NAME_COLUMN-th column
 >          * is untouched, since what would be added here is by
 >          * definition already known to the caller (it already passes it
 as
 >
 > }}}
 >

 ok

 > - On a related note: now that we pass the array in a securer way,
 >   maybe we should make the definition of the size more consistent with
 >   the array organization by defining it in the enum rather than via a
 >   separate const variable?  i.e., instead of
 > {{{
 > +    static const size_t COLUMN_COUNT = 5;
 > }}}
 >   do this:
 > {{{
 >     enum RecordColumns {
 >         TYPE_COLUMN = 0,    ///< The RRType of the record (A/NS/TXT
 etc.)
 > ...
 >         NAME_COLUMN = 4,    ///< The domain name of this RR
 >         COLUMN_COUNT = 5    ///< Must be the largest value of above + 1.
 >     };
 > }}}
 >

 ack, done

 > - getRecords: This doesn't seem to be correct:
 > - Likewise, the corresponding description for getAllRecords() doesn't
 >   make much sense:

 ack. Redid the docs, also added a note about how the caller should expect
 any
 exception to be thrown. This is currently not actually reflected in our
 own code
 that calls getRecords, but depending on the higher-level it should perhaps
 go
 there.

 >
 > '''sqlite3_accessor.cc'''
 >
 > There doesn't seem to be anything obviously wrong, but I have some
 > comments about the design and implementation of
 > SQLite3Database::Context.
 >
 > - Should we allow creating multiple iterator context from a single
 >   DatabaseClient at once?  At least it won't happen when we use it
 >   from find().  For zone iteration we might want to allow that, but in
 >   that case I suspect we'd rather "clone" the database connection.  If
 >   we restrict the use of the iterator context to be "singleton" per
 >   client, we can use a prepared statement, which would be a bit
 >   faster, and it may be important in the context of find().
 >

 I believe Michal had a use-case for that (though I'm not sure what it was,
 and
 whether that use-case was just the ability to do so itself). I must say
 that,
 apart from the obvious performance penalty, I do like the cleanliness of
 having
 them prepared per use, instead of per connection (and having all init and
 cleanup on the same level, as it were). Apart from that I have no strong
 opinion
 on it.

 For now I have left this as it is, but I'm open for discussion.

 > - As pointed out above, I'd suggest that Context() take "name" as
 >   std::string.  Also, we might want to store the passed string inside
 >   the class. If the std::string implementation uses refcount +
 >   copy-on-write, the overhead should be marginal, and we can then also
 >   change SQLITE_TRANSIENT in bindName to SQLITE_STATIC.
 >

 only one 'problem' with that is that is needs to store some dummy value in
 the
 case of getAllRecords()

 > - I'd do sqlite3_finalize() once getNext() results in SQLITE_DONE.
 >   Then even if an application is lazy and doesn't release the context
 >   soon we can at least release the now-unnecessary lock on the DB.
 >

 done, moved it to a private method btw, which also sets the pointer to
 NULL.

 > - The call to sqlite3_finalize() in bindName() doesn't seem to be
 >   necessary; if I miss something and it's really necessary, then I
 >   suspect we also need it in bindZoneId().
 >

 not adding it at bindZoneId() was an oversight, and it may be interpreting
 the
 documentation overly strict, but the way i read it every call to
 prepare_statement() MUST be cleared with a finalize() at some point, and
 since
 we call prepare() and bind() methods from the constructors, anything after
 prepare() should do finalize() before it raises and exception.

 > - Since convertToPlainChar() is only used from the Context class, we
 >   could make it a private method of the class.  Then we don't have to
 >   pass dbparameters to the function. (btw, as far as I can see the
 >   passed dbparameters should never be NULL so the NULL check seems to
 >   be redundant).
 >

 ack, moved

 > '''database_unittest.cc'''
 >
 > - this comment seems incomplete (missing closing parenthesis?)
 > {{{
 >             // the error handling of find() (the other on is below in
 >             // if the name is "exceptiononsearch" it'll raise an
 exception here
 > }}}
 >   (actually it doesn't parse well either)
 >

 updated the text, but actually, i am wondering if we should still have
 them, as
 these exceptions are no longer explicitely caught, making the test
 effectively
 only test whether these exceptions are not caught...

 > '''sqlite3_accessor_unittest.cc'''
 >
 > - Not really for this branch, but I think it's not really appropriate
 >   to use a zone containing only one RR(set) for the getAllRecords()
 >   test.
 >

 changed it to use example.org.sqlite, which contains 11 records.

 > - Don't we need iteratorColumnCount any more?  The fact that getNext()
 >   doesn't throw is already checked in the "iterator" test.
 >

 No, I removed all impossible tests there but forgot to check if the rest
 was
 actually useful now :) removed.

 > - On a related note: I'd avoid using magic numbers in iterator tests:
 > {{{
 >     const size_t size(5);
 >     std::string data[size];
 > ...
 >     EXPECT_EQ("example2.com.", data[4]);
 >     EXPECT_EQ("SOA", data[0]);
 >     EXPECT_EQ("master.example2.com. admin.example2.com. "
 >               "1234 3600 1800 2419200 7200", data[3]);
 >     EXPECT_EQ("3600", data[1]);
 > }}}

 Actually, initially I had a comment somewhere that this was intentional,
 but now
 that we don't use them in the actual implementation anymore we can change
 them
 here as well. Done.

-- 
Ticket URL: <http://bind10.isc.org/ticket/1183#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development