BIND 10 #1068: support writability in the new data source API

BIND 10 Development do-not-reply at isc.org
Thu Aug 25 00:43:02 UTC 2011


#1068: support writability in the new data source API
-------------------------------------+-------------------------------------
                   Reporter:  jinmei |                 Owner:  jinmei
                       Type:  task   |                Status:  reviewing
                   Priority:  major  |             Milestone:  Sprint-20110830
                  Component:  data source |        Resolution:
                   Keywords:         |             Sensitive:  0
            Defect Severity:  N/A    |           Sub-Project:  DNS
Feature Depending on Ticket:         |  Estimated Difficulty:  7.0
        Add Hours to Ticket:  0      |           Total Hours:  0
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:7 vorner]:

 > As I probably had changes from 1183 in the diff as well, I might comment
 on something from there. But AFAIK you reviewed that one, so if it is not
 your code, please ignore it.
 >
 > So, the comments:
 >  * It doesn't compile for me, because `ASSERT_EQ` returns and is used in
 the `SQLite3Update` constructor:
 > {{{
 > sqlite3_accessor_unittest.cc: In constructor
 ‘<unnamed>::SQLite3Update::SQLite3Update()’:
 > sqlite3_accessor_unittest.cc:302:60: error: returning a value from a
 constructor
 > }}}
 >    May I suggest either throwing or using EXPECT_EQ, so people will know
 the follow-up errors are probably related to the failure? Or maybe there's
 an official solution to this?

 Good catch, and I actually chose to simply remove the xxx_EQ checks
 altogether.  See the commit log.
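
 For context, the throwing option would look something like this (the
 fixture and its setup function below are invented for illustration; they
 are not the actual unittest code).  ASSERT_EQ expands to a "return;" on
 failure, which is illegal in a constructor, but throwing works fine:

```cpp
#include <stdexcept>

// Hypothetical sketch of a test fixture that throws on setup failure
// instead of using ASSERT_EQ (which cannot be used in a constructor).
class SQLite3UpdateSketch {
public:
    SQLite3UpdateSketch() {
        // instead of ASSERT_EQ(DB_OK, openDatabase()):
        if (openDatabase() != DB_OK) {
            throw std::runtime_error("failed to open test database");
        }
    }
private:
    static const int DB_OK = 0;
    static int openDatabase() { return DB_OK; } // stand-in for real setup
};
```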

 >  * Comment of `TTL_COLUMN = 1` is incomplete (terminating at an `a`).

 (this is for #1183, and I pointed it out there:-)

 >  * Comment of `ADD_TTL` ‒ it says it is integer. But the actual data
 type is string, this wording is probably confusing. Something like „string
 representation of integer in decimal representation“?

 Revised to "in numeric form".  I see your point, and I had actually
 revised the doc as suggested, but on looking at the documentation more
 closely, I realized this issue was not specific to TTL; "a domain name"
 could also be confusing because it may mean a Name object or something
 else.  Some other comments don't even mention the intended type.  The
 documentation for addRecordToZone() already clarifies that these fields
 should be strings, so something like "string representation of integer"
 seemed both incomplete and redundant.  "In numeric form" seemed to
 address the concern better without being too redundant.

 >  * The order of columns is different in the read and write methods (eg.
 the name is at the end in reading, at the beginning when writing).
 Wouldn't it be cleaner if they had the same order (with obvious variations
 around columns present only at one method)? They can be freely reordered
 in the SQL command, so it shouldn't be a technical problem.

 Hmm, maybe, but if we are to make the ordering consistent, I'd prefer
 placing the owner name in the first column (just as in an RR); but
 then it may be a bit unnatural for getNext(), because the name is an
 "optional" field for that method - it's expected to be untouched
 (effectively "null") in the case of a single lookup (via getRecords()).

 Also, the ordering may not matter as much as it might look; applications
 are expected to use the enum symbols rather than refer directly to
 specific numbers like this.

 Considering these points, I'd be slightly inclined to keep this unchanged
 (and leave them "inconsistent").  But if you still prefer consistency
 after taking these points into account and have a specific ordering to
 propose, I'm okay with adopting it.

 >  * The documentation of getRecords uses part of description for
 getAllRecords, which does not apply completely (for example it talks about
 order of RRs from different RRsets).

 (this is for #1183, and I pointed it out there:-)

 >  * To make the interface consistent, should the vectors be replaced by
 fixed-sized arrays and their sizes be checked compile-time by the
 reference trick?

 Another good point.  I actually considered that option, but didn't
 choose it because I wanted to share the doUpdate() code between the
 add and delete cases (which have different numbers of columns/params).
 But thinking about it further, it wouldn't be a good idea to make the
 public interface counter-intuitive for the convenience of the internal
 implementation, and, in fact, I found it not so difficult to share the
 update code for both cases using a C++ template.  So I made the change.
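
 A sketch of the combination (identifiers and counts here are invented
 for illustration): the reference-to-array parameter makes passing an
 array of the wrong size a compile-time error, while a size template
 parameter lets the add and delete cases share a single doUpdate() body:

```cpp
#include <cstddef>
#include <string>

// Hypothetical column/parameter counts for the two update operations.
const std::size_t ADD_COLUMN_COUNT = 4;
const std::size_t DEL_PARAM_COUNT = 3;

template <std::size_t N>
std::size_t
doUpdate(const std::string (&params)[N]) {
    // One shared implementation; N is deduced at compile time.
    return N;
}

std::size_t
addRecord(const std::string (&columns)[ADD_COLUMN_COUNT]) {
    return doUpdate(columns);
}

std::size_t
deleteRecord(const std::string (&params)[DEL_PARAM_COUNT]) {
    return doUpdate(params);
}
```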

 >  * There are enums used as indices and with the last _COUNT item to mark
 the count. On one side, we have a requirement to manually number enums in
 our style guide (I think I saw it somewhere), but maybe in this and
 similar cases, we should reconsider. The standard says an enum starts at 0
 and each next item is one larger than previous, if the number is free,
 which is exactly what we want in this case (numbering by one, starting at
 0). Numbering manually is error prone, since human might just add one and
 not update all the previous ones, or something. Should we discuss it on
 the ML?

 Point taken.  Personally I have mixed feelings about this.  On one
 hand I think your point is valid.  On the other hand, I don't like to
 depend too much on implicit assumptions even if they are guaranteed by
 the language spec - just as enumerating everything by hand is
 error-prone (I agree on that), I think code relying on implicit
 assumptions is also error-prone.  Someone who doesn't understand
 the rationale may, perhaps for a bogus reason, want to change the
 initial value of the enum to something other than 0:

 {{{
     enum DeleteRecordParams {
         DEL_NAME = 1, ///< The owner name of the record (a domain name)
                       ///< It begins with 1 as it seems to make more sense
         DEL_TYPE, ///< The RRType of the record (A/NS/TXT etc.)
         DEL_RDATA ///< Full text representation of the record's RDATA
     };
 }}}

 And then the code will suddenly start failing in a strange way.

 In this specific case it would be easy to detect and fix via unit
 tests, but there may be more subtle cases.  We could also add a comment
 noting that the use of the default definitions is deliberate and should
 not be changed, but commenting something is also an error-prone,
 human-involved practice.

 So, I'm not sure which one is better.  And yes, we should probably
 discuss it on the ML.
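
 For comparison, the implicitly numbered variant with a trailing _COUNT
 sentinel would look something like this (enumerator names are just for
 the sketch).  The standard guarantees the first enumerator is 0 and each
 subsequent one is one larger, so the sentinel always equals the count:

```cpp
// Hypothetical implicitly numbered enum; the compiler assigns
// 0, 1, 2, ... automatically, so inserting a new enumerator never
// requires renumbering by hand.
enum DeleteRecordParams {
    DEL_NAME,        ///< implicitly 0: the owner name (a domain name)
    DEL_TYPE,        ///< implicitly 1: the RRType (A/NS/TXT etc.)
    DEL_RDATA,       ///< implicitly 2: full text of the record's RDATA
    DEL_PARAM_COUNT  ///< implicitly 3: the number of parameters above
};
```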

 >  * Isn't the agent for execution called „executioner“ instead of
 „executer“ (sorry, I'm not native speaker, but I did not hear the second).

 As you know, I'm not a native English user either:-) I've looked it up
 in dictionaries and you're right, there's no "executer".  I simply named
 it that way by the naive convention of adding "-er", and I have no
 problem with using a real word.  But "executioner" doesn't seem to be
 the right choice either, because its only meaning (as far as I know,
 and according to the dictionaries I have) is "a person who has the
 job of executing criminals."

 Maybe "StatementProcessor" is a bit better?  In any case, this is just
 a private helper class used only in a single .cc file, so it doesn't
 make much sense to spend too much time trying to find the best name for
 it.  If you don't like any of the above, I'll simply go with
 "StatementExecutioner".

 >  * The `startUpdateZone` returns an ID, but the ID isn't used to
 anything (not a parameter to any of the updating methods). As you note, it
 might not correspond to the one from `getZone`, so it can't be used for
 reading as well. What is its purpose? I guess the current API doesn't
 allow
 updating multiple zones at once (even in case the underlying DB might).

 Ah, yes, I should have noted that.  It's currently unused, but it will
 be needed in the latter half of this task, which will introduce the
 "updater finder": a ZoneFinder that explores the zone being updated
 (i.e., within the update transaction).  If the underlying
 implementation uses a different zone ID for the updated zone, the
 ZoneFinder will make sure it uses the correct ID by using the one
 returned by startUpdateZone() rather than the result of getZone().

 >  * Is it guaranteed that the application will see the updated data in
 the transaction, or is it implementation detail of the SQLite one? Is it
 allowed for some backend to create a separate connection for the updates
 and have any reads out of that transaction? Do we support DBs that don't
 support transactions (eg. text files, CVS files, some tables of MySQL)?

 As commented above, the idea is to guarantee that via the "updater
 finder", assuming the underlying database supports transactions.

 As for database(-like) systems that don't support transactions, to be
 honest I've not fully thought about it, but realistically I suspect we
 should either say the corresponding data source cannot be writable for
 such systems, or provide our own middle layer (if possible) to emulate
 transactions (which would probably be a different derived class of
 DatasourceClient than the "standard" DatabaseClient).

-- 
Ticket URL: <http://bind10.isc.org/ticket/1068#comment:8>
BIND 10 Development <http://bind10.isc.org>