BIND 10 #517: Empty node processing in MemoryZone difficult Part

Wed Jan 26 02:11:23 UTC 2011

#517: Empty node processing in MemoryZone difficult Part
-------------------------------------+-------------------------------------
                 Reporter:  hanfeng  |                Owner:  jinmei
                     Type:           |               Status:  reviewing
  enhancement                        |            Milestone:  A-Team-
                 Priority:  major    |  Sprint-20110126
                Component:  data     |           Resolution:
  source                             |            Sensitive:  0
                 Keywords:           |  Add Hours to Ticket:  0
Estimated Number of Hours:  0.0      |          Total Hours:  0
                Billable?:  1        |
                Internal?:  0        |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Review on rbtree.h.

 '''RBTree forward declaration'''
 Is this necessary?
 {{{
 template <typename T>
 class RBTree;
 }}}

 '''RBNode'''
  - why successor() is a public method?  It's not used anywhere, and
    the interface is not good from design point of view because
    it reveals an internal detail of RBTree implementation (that it's
    actually a tree of trees).  The only justifiable reason for this
    to be public that I can think of is the convenience for testing,
    but the there's no direct test for it.  On a related note, if it's
    really intended to be public, the fact that there's no direct test
    is bad, because we should generally test things as closely to the
    test target as possible.
  - documentation isn't sufficient.  please explain more details about
    "successor".  providing an example would also help.  also explain
    when to return a null node.

 '''RBTree class description'''
  - this sentence(?) doesn't parse (and s/ture/true/?):
 {{{
  * The default policy is to NOT return an empty node to end user,
  * pass ture will get empty node during find is needed.
 }}}
  - the following note is barely understandable, too:
 {{{
  *  \note The search policy only affect find behavior of rbtree, but when
  *  inserting one name, if the node already exist not matter it's insert
  *  explictly by end user or empty node created by the rbtree itself,
  *  insert will return ALREADYEXISTS not matter we want expose empty node
  *  or not.
 }}}
  - is this "todo" now implemented?
 {{{
  *  - since \c RBNode only has down pointer without up pointer, the node
 path
  *    during finding should be recorded for later use
 }}}

 '''RBTree::NodeChain'''
  - it needs more documentation, including "what it is (in more
    detail)", how it's expected to be used, design choice about why
    it's a stack, etc.  usage example will also be helpful.
  - as for the design: I'm afraid stack is not an efficient choice.  In
    addition, using a stack introduces another possibility of exception
    to find(), which is not good.
    Due to the limitations of DNS names, node chains have a fixed max
    size, and BIND 9 uses a pre-allocated fixed buffer internally held
    in the chain.  we might want to do the same.
    Furthermore, we may not even need the chain framework in the first
    place.  At least for the purpose of empty node support, what we
    really need is to identify the "next" node of a given RBTree node.
    The role of chaining for this purpose is to provide a back pointer
    when we need to go up from one (sub)tree to another.  this could
    actually be done by an explicit pointer at the root node of
    (sub)trees.  this will introduce different complexity, so may or
    may not be a good idea.  For now I'm okay with keeping the chain
    approach, but I'd like to leave these considerations in
    documentation.

 '''RBTree constructor'''
  - documentation: describe the newly introduce parameter.

 '''about findEx'''
  - First off, why do we need to introduce a new method?  My
    understanding is that this will be used only from
    MemoryZone::find() (is that correct?).  And, so will the existing
    RBTree::find() with a callback.  So, I'd simply merge these two
    cases specifically designed for MemoryZone and described it as
    a note.  As I repeatedly said, we don't need too much
    generalization; the MemoryZone class is very special, and the
    functionality it requires won't be needed in other places.
  - If there's a strong reason for defining a new method, I personally
    would avoid unclear acronyms like "Ex".  In fact, I don't know what
    it means in this context.  I'd give it a more descriptive name.
  - I'd want to see more description, including why we need this (in
    addition to the existing find() variants) and when to use it.
    I'd update the general description of the "Find methods" in this
    regard.  Note also that we may have to update existing description
    about the "no data OK" mode.
  - why is "nextNode()" a part of "Find methods"?  Is that intentional?
    If so, the general description should be updated accordingly.

 '''nextNode()'''
  - code seems to do what it claims to do, but I'm afraid it's
    unnecessarily inefficient: it involves a copy of current node chain
    and possibly with additional push/pop operations, which are also
    expensive.  In BIND 9, "node_path" is mutable and it's modified in
    the "next" function.  I think we can do the same thing, because
    node_path is quite likely to be a locally prepared structure for a
    specific search.  We could also eliminate the expensive push/pop
    operations by revisiting data structure (see above).

-- 
Ticket URL: <http://bind10.isc.org/ticket/517#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development