BIND 10 #565: Boss should start/stop b10-auth and b10-recurse without restart

BIND 10 Development do-not-reply at isc.org
Wed Feb 23 02:52:45 UTC 2011


#565: Boss should start/stop b10-auth and b10-recurse without restart
-------------------------------------+-------------------------------------
                 Reporter:  vorner   |                Owner:  jinmei
                     Type:  defect   |               Status:  reviewing
                 Priority:  major    |            Milestone:  A-Team-
                Component:  Boss of  |  Sprint-20110223
  BIND                               |           Resolution:
                 Keywords:           |            Sensitive:  0
Estimated Number of Hours:  3.0      |  Add Hours to Ticket:  0
                Billable?:  1        |          Total Hours:  0
                Internal?:  0        |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 I don't see critical problems in the patch itself at this stage.  Some
 minor points are commented below.

 While looking at the code, however, some fundamental architecture
 issues arose.  The bob (or the BIND10 system in general) doesn't seem
 to be able to handle subtle failure cases due to the asynchronous
 architecture.  For example, what if we start/stop b10-auth
 successfully but fail to do it for b10-xfrout?  Or what would happen
 if we start a server (either auth or resolver) immediately after
 stopping it?  For bob, "stop" means sending a shutdown message, so it
 would be possible to try to start it again before the process actually
 dies.

 These are probably not an issue specific to this patch but rather a
 larger architectural one for the BIND10 process model, though.  To
 make it really production ready, I guess we need to carefully design
 state management of the subprocesses so that the questions like the
 above ones are clarified.

 At the current level of overall maturity of BIND 10, I'm okay with
 leaving this as a backlog.

 '''bind10.py.in'''
  - In config_handler() I'd say 'If this is...' here:
 {{{
         # This is initial update, don't do anything by it, leave it to
 startup
 }}}
   Or move the comment in the 'if not self.runnable' block.
  - I see some duplicate pattern in config_handler().  maybe we need
 refactoring.
  - In stop_process(), s/second/sendto/ ?
 {{{
 +        Stop the given process, friendly-like. The process is the name it
 has
 +        (in logs, etc), the second is the address on msgq.
 }}}
    Or do you mean the "second argument" (for the caller)?  In any case
    the variable name 'second' sounds a bit misleading to me; it looks
    more like a function/operation name.  I'd name it something like
    'recipient', 'remote', etc.

 '''bind10_test.py'''
  - StartAllProcessesBob should better be renamed.  same for the
    comment of the class.  Also, 'process' is not really the correct
    term because ccsession is not a process (although it's not specific
    to this patch).
  - In StartAllProcessesBob, we probably don't need all of the stop_xxx
    (e.g., stop_ccsession())
  - check_started_xxx's contain a lot of duplicates.  I would combine
    them into, e.g, check_started_common, where
    msgq/cfgmgr/ccsession/stats/cmdctl/(and perhaps more) are checked,
    and call it from check_started_xxx's (or from each specific test).
    We might also pass true/false to the common checker so that it can
    be used for check_preconditions() and check_started_none().
  - test_start_none(): s/Created Bob/Create BoB/?  Same for other
    similar comments (I noticed this is actually not for this patch).
 {{{
         # Created Bob and ensure initialization correct
 }}}

 '''changelog'''

 I would regard this change as 'func', although I don't oppose if you
 prefer calling it a bug.

-- 
Ticket URL: <http://bind10.isc.org/ticket/565#comment:5>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list