BIND 10 #2974: Implement Hooks Framework (part 1) - all but library loading and unloading

Wed Jun 12 20:21:43 UTC 2013

#2974: Implement Hooks Framework (part 1) - all but library loading and unloading
-------------------------------------+-------------------------------------
            Reporter:  stephen       |                        Owner:  tomek
                Type:  task          |                       Status:
            Priority:  medium        |  reviewing
           Component:  Unclassified  |                    Milestone:
            Keywords:                |  Sprint-DHCP-20130620
           Sensitive:  0             |                   Resolution:
         Sub-Project:  DHCP          |                 CVSS Scoring:
Estimated Difficulty:  0             |              Defect Severity:  N/A
         Total Hours:  0             |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by stephen):

 I noticed you had partially completed the review, and as most of the
 points don't require a code modification, decided to respond to them now.

 From comment:6

 > Ok, but please update 2980 with a simple note for this and previous
 work. I don't want those things to slip through the cracks.
 (About making !ServerHooks a singleton) Done.

 > Fair enough. I now know what the CONTEXT_CREATE and CONTEXT_DESTROY are
 for, but the next person who will look at it, will need to get through the
 same discovery pains as I did. If someone asks me: what the context_create
 and context_destroy callouts are for, I don't have any place to point him
 to.
 We could call them something like "initialize" and "rundown".  The point
 is that the "initialize" callout gets control before any of the "real"
 hooks are called for a packet/request, and the "rundown" gets control
 after all have been called. In both cases, this is regardless of what
 hooks are actually called.

 > Did you mean "negative" values for errors? COMPLETE now has value 1.
 Whoops!.  Now modified to have a value of -1.

 > Not really. Two real life examples:
 > 1) library A does client classification, library B does logging
 > 2) library A does access control (checks against customer database,
 checks who paid their bills etc.), library B does statistic

 Taking (1) as an example, the point I was trying to make is that if the
 server doesn't provide classification information, how does library B know
 where to look for the information provided by library A?

 In any case, they can do it through the argument list.  We may want to set
 a convention that BIND 10 guarantees that no argument name will start with
 a special character (e.g. a "$"), leaving a namespace that inter-library
 communication can use.

 > I think we're arguing about different things, while in fact both use
 cases are covered. Is the following true?...
 Yes.

 From comment:7

 > The design does not allow for extra libraries to be loaded at runtime.
 This is something we need to be ready for. You start a server, then
 configure it to use one library, then configure it to use another one. You
 can't assume that the number of libraries is constant.
 That is handled via reconfiguration (not yet written).  The intention is
 for the configuration to specify an array of libraries: these are read and
 the libraries loaded one at a time.  On a reconfiguration, libraries are
 unloaded and reloaded.

 > Who is intended user of the public methods in !CallOutManager: server-
 side code or user library? I assume it is server code, right? Again, I
 have trouble understanding how the intended usage model looks like.
 It's described in the [wiki:Bind10HooksFramework design document].
 !CalloutManager is a server-side construct.  !LibraryHandle - which is a
 restricted interface into !CalloutManager - is made available to callouts.

 > Who is going to call setLibraryIndex()? I'm concerned with the fact that
 the registerCallout() may be called by the user library, right
 As described in the [wiki:Bind10HooksFramework design], the library index
 is set by the server-side code when calling the callouts - each callout
 has the library index associated with it.  However, because only the
 !LibraryHandle is made available to the callout, the callout cannot see or
 access that index.

 LibraryHandle::registerCallout() can be called by user-written code in one
 of two places:

 1. in the library's load() function (if registering non-standard
 callouts).  In this case, the index is set by the library loading code.
 1. in a callout.  User callouts are called by
 CalloutManager::callCallouts(), and this sets the index before calling a
 user callout.

 > The functions in .h file are getting bigger and bigger.
 They're not that big - typically where the code is little more than an
 "if" statement.  If you really want, I can move them to the .cc file.

 > CalloutManager::registerCallout() Isn't the
 checkLibraryIndex(current_library_) check redundant? It was already
 checked when setting up current_library_.
 As noted, it's a sanity check to guard against a programming error.  The
 idea is to get a meaningful message if (for example), this call is made
 after all libraries have been removed.

 > Why do you require to have the callouts from the same library be grouped
 together? Michal gave specific example when he wants them to be out of
 order: the encryption + signing
 The order of callouts is the order that the libraries are specified in the
 configuration.  The extra check is for dynamic addition of callouts - if
 the order is specified as library A then library B, and library A
 dynamically adds a callout on a hook, we guarantee that it will be called
 before any callouts registered by library B for that hook.

 Taking Michal's example, if both encryption and signing are in the same
 library and two callouts are to be registered for a hook, the library's
 load() function can register them in the order it requires.  If they are
 in separate libraries, they are run in the order the libraries are
 specified in the configuration.

 > There's 27 hook points for DHCPv6, multiplied by 16000 (current number
 of transactions per second) gives you 432 thousand copies per second.
 I think you're being overly dramatic.  First off, I doubt if we ever will
 have all 27 hooks with callouts attached to them: if you are checking
 whether there is a callout present before calling it (to avoid overhead
 attached to adding argument information to the !ContextHandle), the copy
 will never be called in those cases.  Secondly, does every packet pass
 through all 27 hook points?  Finally, it's a copy of a few bytes (8 bytes
 * number of callouts on a 64-bit system); that overhead is likely to be
 dwarfed by the code involved any callout called (and the more callouts
 that are called on a hook, the more that overhead is amortised between
 then).

 I suggest we leave it until we do a performance measurement. We'll load a
 library with null callouts on every hook point and measure the overhead
 with the copy in and with it out.  If there is a significant difference,
 I'll remove it.

 > deregisterCallout() - that's a tricky removal code. I think I understand
 how it works, but isn't there a simpler way to remove something from a
 vector?
 I can't think of one other than effectively doing the same:
 1. Working through the the vector comparing elements against my test
 element (the equals_to())
 1. Moving non-matching elements to lower vector indexes (the remove_if())
 1. Truncating the vector (erase()).
 I could use a std::list instead, but I don't think it's worth it for the
 simplification of one function - especially as this particular construct
 is documented in a book for which the reference is supplied.

 '''src/lib/util/tests/callout_manager_unittest.cc'''[[BR]]
 There are some valid points here that need code modification.  These will
 be addressed when all reviews have ended.  However...

 > !CalloutManagerTest.!CallCalloutsError: I'm somewhat uneasy with the
 fact that the first library that returns an error will cause other
 libraries to not be called. What if the first library is logger and it
 reports error, because partition for logs is full? Then the code will stop
 calling libraries for access control, host reservations etc. That would
 essentially disable hooks and the server would continue to run in its
 "vanilla" form. This would make a very fragile system, with easily
 triggered domino effect. The first library that falls over will trip all
 others as well.
 That is a pertinent observation and I'm in two minds here.

 On the one hand, you are correct with your example in that a failing
 callout could effectively disable the hooks system.  On the other, perhaps
 a callout started modifying the data passed to it, failed and left the
 data in an inconsistent state?  That inconsistency could cause a crash in
 another callout.

 Allied to this is the question of whether we should have a COMPLETE status
 at all? The original idea was to allow a callout to have the last word in
 whether processing was complete.  But a poorly-written callout - for
 example, returning an unitialised status - could cause random skips.

 Also, it struck me that the callout manager needs some form of logging
 precisely to record what library (if any) returns an error.  That will be
 added in the next ticket (which adds code to load the libraries); the
 reason it is being delayed is because any error message should include the
 name of the library, not the index.

 Perhaps the easiest way forwards is to get rid of COMPLETE and to log
 errors returned by callouts but otherwise ignore them.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2974#comment:8>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development