BIND 10 #1789: update xfrin to maintain in-memory/sqlite3 zones

Wed Apr 11 14:34:36 UTC 2012

#1789: update xfrin to maintain in-memory/sqlite3 zones
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  jinmei                             |                Status:  reviewing
                       Type:  task   |             Milestone:
                   Priority:         |  Sprint-20120417
  medium                             |            Resolution:
                  Component:  xfrin  |             Sensitive:  0
                   Keywords:         |           Sub-Project:  DNS
            Defect Severity:  N/A    |  Estimated Difficulty:  3
Feature Depending on Ticket:  xfr    |           Total Hours:  0
  for in-memory                      |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by jelte):

 * owner:  jelte => jinmei

Comment:

 Replying to [comment:7 jinmei]:
 > '''some general points'''
 >
 > - This is rather a question about the general configuration framework,
 >   but I wonder what's the assumption on the validity of given data on
 >   the remote config.  In this specific case, can we assume the zone
 >   origin and class names are valid?  Or can we assume the data don't
 >   contain any duplicates?  In our current implementation we should be
 >   able to assume these, because b10-auth should have validated these
 >   and cfgmgr should only store post-validated data (and subsequently
 >   give it to xfrin).  But the validity isn't guaranteed from the
 >   interface (after all they are opaque strings), and the validation
 >   point (auth) is far from the user (xfrin), so the assumption seems
 >   to be a bit naive.  On the other hand, doing the same validation in
 >   the xfrin seems to be redundant in practice.  I think we need to
 >   clarify these points and document it somewhere, so we can use
 >   unified assumption throughout the system.  For the purpose of this
 >   task, I don't think we need to heavily modify the code, but add a
 >   short note about the assumption.
 >

 Yes, the whole point of the specification file to validate data was that
 once the data gets to the 'handler' routines, they won't have to worry
 about the data being bad (at least in its form and structure)

 One thing that I am considering for the api changes in config use is more
 validation (probably automatic; implicit in the initialization of the
 classes), so that it will indeed be validated each and every time the data
 is initialized in every module. Most of that may be redundant but it
 should never be on a critical performance path anyway. Especially since I
 also want to start using 'changes' somehow; instead of passing full lists,
 pass the elements that are added or removed (memory zones are a good
 example of why we need that; if you have 10000 zones you do not want to
 reload the entire list if you add or remove one ;))

 > - Slightly related the previous point, auth_config_handler() doesn't
 >   provide the strong exception guarantee, and if given config contains
 >   invalid data and triggers an exception, the system will keep running
 >   with some inconsistent state.  Based on the practical assumption,
 >   this type of exception shouldn't happen, however, (and other
 >   exceptions are probably critical enough, in which case the process
 >   should probably die) so it may not matter much in practice.
 >

 Originally, given the above (big) assumption that data is in valid form
 there shouldn't be much that could go wrong here; but your next point will
 change that, so I'm doing a validation run first, and if it passes, then
 the updates are called. If not, the whole update is ignored (and errors
 are trusted to have been reported by the other component)

 > - As for whether we should store zone name and class as string or
 >   objects, I think string is the only practical choice at least for
 >   now, because Name and RRClass are not hashable and cannot be stored in
 >   set directly.  I think we should make these basic classes hashable
 >   so we can use them for this type of purposes, and created a ticket
 >   (#1883).  Of course, that's far beyond the scope of this ticket.
 >   One issue maintaining as bare string is that they can be ambiguous
 >   due to the case-insensitiveness or variations in the textual
 >   representation (e.g. \DDD notation for names or CLASSnnn for
 >   classes).  So we could fail to match the zone if the parameters
 >   stored from the config don't exactly match the parameters given at
 >   the xfr time.  I suspect we cannot give a complete solution unless
 >   we store them as native objects or use specialized comparison (the
 >   latter will make it impossible to use set or dict), but at least for
 >   the case ambiguity we could work around it by always "normalizing"
 >   the parameters, e.g., store/compare after converting all characters
 >   to the lower case.  xfrout does this; see `_get_transfer_acl`.  We
 >   should probably do the same.

 There's also the optional root dot...

 Actually, we can convert them to Name and Type objects, and store the
 lowercased to_text() results of those. Added that.

 >
 > '''xfrin.py'''
 >
 > - why 'auth_config_handler' is "public"?  It seems to be more
 >   appropriate to define it as "protected" (with a single `_`) or even
 >   "private" (double `__`) because it doesn't expected to be called
 >   from outside of the class except as an unnamed callback.

 I kind of considered callback use as 'public', but no strong opinion,
 changed.

 > - Likewise, why `_memory_zones` and `_add_memory_zone`
 >   are "protected"?  (I see `_is_memory_zone` and `_set_memory_zones`
 >   are used in tests;  technically, though, they would even better be
 >   public in that case.  but using from a test is probably a special
 >   case, so "protected" is probably okay)

 Yes, that is why these are 'protected'

 > - `_set_memory_zones`: the `zones` list doesn't seem to be used:
 > {{{#!python
 >                     zones = []
 > }}}

 Ack, removed.

 > - `_set_memory_zones`: "filetype" is an optional item so it may not
 >   explicitly specified.  Can we assume this is safe?
 > {{{#!python
 >                         if zone["filetype"] == "sqlite3":
 > }}}

 oh, oops, thought it was mandatory.

 > - `_set_db_file`: I suspect we need to revise this comment:
 > {{{#!python
 >             # this too should be unnecessary, but currently the
 >             # 'from build' override isn't stored in the config
 >             # (and we don't have writable datasources yet)
 > }}}
 >   - 'too' doesn't make sense any more with the removal of the precedent
 comment.

 Updated

 >   - I don't understand (or remember) how the 'writable datasource' part
 is related to this code, and in any event, we now do have writable data
 source.  maybe just remove it?
 >

 oh, the point of the writable datasources comment was that in the end,
 this code shouldn't be messing with sqlite3 files in the first place; it
 should use the datasource API to update the zone data. Removed that part.

 > '''xfrin_test.py'''
 >
 > - add a test for the case where 'class' is omitted?
 > - likewise, a test case where 'filetype' is omitted?
 > - should we test other invalid cases such as invalid zone or class
 >   name?  (see the general discussion above)
 >

 With these new changes, yes :) Added.

 The test became too long, so I've put them in their own test class; the
 first one is still quite long; it is mainly intended to make sure several
 updates don't interfere, but most of the new tests have their own test
 function now.

-- 
Ticket URL: <http://bind10.isc.org/ticket/1789#comment:9>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development