BIND 10 #2974: Implement Hooks Framework (part 1) - all but library loading and unloading

Tue Jun 11 17:02:48 UTC 2013

#2974: Implement Hooks Framework (part 1) - all but library loading and unloading
-------------------------------------+-------------------------------------
            Reporter:  stephen       |                        Owner:  tomek
                Type:  task          |                       Status:
            Priority:  medium        |  reviewing
           Component:  Unclassified  |                    Milestone:
            Keywords:                |  Sprint-DHCP-20130620
           Sensitive:  0             |                   Resolution:
         Sub-Project:  DHCP          |                 CVSS Scoring:
Estimated Difficulty:  0             |              Defect Severity:  N/A
         Total Hours:  0             |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by tomek):

 Replying to [comment:5 stephen]:
 > Wow, where to start? I'll take this a bit at a time, but first I'd like
 to clear up some terminology:
 I'm sorry the review is so extensive, but so is the code under review.
 With such complex issues at hand, it would be better to split it into
 smaller chunks. But it is too late for that - let's continue with this as
 one ticket.

 To make this discussion a bit more readable, I removed your responses that
 I agree with or that address my previous comments.

 > * !Hook/Hook Point - used interchageably, this is a point in the code at
 which a call to user-written functions is made.  Each hook has a name and
 each hook can have any number (including 0) of user-written functions
 attached to it.
 >
 > * Callout - a user-written function.  This is so-named because the
 server "calls out" to the library to execute a user-written function.
 At first I got the impression that "hook" is overloaded and has 2
 meanings. Thanks for clarifying it. I have updated Terminology section of
 Bind10HooksFramework a bit to reflect that.

 > > The code under review implements many features that are not needed in
 my opinion. Here's my recommendation for the user library designer:
 > >
 > > * if you need to store something for the whole lifetime of the
 library, allocate it in load() and deallocate in unload()
 > >
 > > * if you need to store something for packet lifetime, allocate it in
 context_create callout and deallocate in context_destroy callout.
 > >
 > > * if you need to share something with other libraries, use
 setArgument(). All callouts called after yours will have access to that
 information
 >
 > Yes, that's the philosophy behind it.
 >
 > > Please move all code to the src/lib/hooks directory.
 >
 > I'd come to that conclusion myself when considering the hook code,
 mainly because it will need to include some logging calls.  I'll do that
 as the first task of #2980.
 Ok. Anything that makes this ticket smaller is a good thing. :-)

 > > There is no Developer's Guide documentation. Please add .dox file that
 explains the big picture. It doesn't have to be long.
 > I'll split #2982 into user-documentation (how to write a hook) and
 maintainer documentation (how the hooks framework works).  Regarding the
 former, I'll check with Jeremy as to whether this should be a .dox file or
 an Docbook guide.
 Ok. I recommend it to be part of the Developer's guide. All other things
 being equal, Doxygen wins, because it is easier to reference to other code
 pieces.

 > > registerHook() - this method should be renamed to avoid confusion with
 the hooks registered by the user library.
 > The current naming is that !ServerHooks contains "registerHook", which
 makes a hook known to the hooks system.  Libraries use "registerCallout"
 which takes as argument the name of a hook.
 Fair enough. I initially thought that registerHook is for registering
 callouts, but once I got used to the terminology, it is fine as it is now.

 > > !ServerHooks() constructor: Do we have a naming convention for hook
 names?
 > No - they are just strings.  We can name them as needed.  However, we
 may want to take into account the fact that we plan for "standard
 callouts" to have the same name as the hooks to which they will be
 attached, and that the standard for C/C++ function names is to use
 lower0case.
 That works for me. Whatever way we choose, it should be documented
 somewhere.

 > > Since !ServerHooks is created just once and is kept around until the
 server is shut down, it should be a singleton. I don't like the idea of
 using shared_ptr (e.g. in !LibraryHandle constructor). This is against the
 philosophy of shared_ptr that is expected to a garbage collector. No such
 thing is needed for !ServerHooks.
 > I'll probably do that, but I'd like to defer it to #2980 to avoid making
 any more changes than absolutely needed at the moment.
 Ok, but please update 2980 with a simple note for this and previous work.
 I don't want those things to slip through the cracks.

 > '''src/lib/util/hooks/server_hooks.cc'''[[BR]]
 > > We need to have a list of defined hooks.
 > The hooks will be different for the different modules (resolver, auth,
 dhcp4 etc.)  Regarding DHCP, I thought we did have a [wiki:DhcpHooks list
 of DHCP hooks].
 Yes, we have that, but my point was more about documentation being
 detached from the code, so it is painful to keep both of them in sync. If
 the doc was kept in *.h|cc files, it would be much easier to keep it up to
 date. But let's defer it to the documentation task.

 > > I would prefer to have a separate header file with hook names. That
 header file should have little more than a list of static consts
 withexplanations.
 > Hooks are registered by the server, so it is up to the server author how
 they do that.  The updated version of !ServerHooks offers the possibility
 of distributed registration functions.  However, if you want to register
 all the hooks up fron in one module, that's OK as well.
 Ok. I think I first wrote my point in a scratchpad, then we had a
 discussion on Jabber and then I've posted my review with this particular
 point no longer being relevant. Sorry for the confusion. I consider this
 particular point to be agreed on.

 > > It would be useful for at least 2 purposes. First,
 > > it would detect any misspellings at compilation time.
 > As hook names are strings, I'm not sure how misspellings would be
 detected.
 One could use predefined constant instead of typing the hook name every
 time. But that was just a mild suggestion and you chose to not implement
 it. That is fine.

 > > For each hook, we need to specify at least the following:
 > Yes, we need documentation, agreed.  But this documentation could start
 getting very complex and may be better in a !DocBook.  In part, there are
 three audiences for documentation:
 >
 > 1. Administrators
 > 2. User-library programmers
 > 3. BIND 10 maintainers
 >
 > The question is whether the documentation for (2) should be mixed with
 that for (3).
 My recommendation is to keep the documentation as close to the source code
 as possible, especially 2. and 3. See http://klub.com.pl/dhcpv6/doxygen/
 for example of how extensive the Doxygen based documentation can be.

 > > I don't like the CONTEXT_CREATE and CONTEXT_DESTROY names. They are
 too vague. Which context you have in mind? I would rename them to
 DHCP6_PKT_CONTEXT_CREATE ...
 > These hooks are general hooks managed by the hooks system.  The idea is
 that when a !CalloutHandle is created, CONTEXT_CREATE is automatically
 called: when the handle is destroyed, CONTEXT_DESTROY is called.  A
 !CalloutHandle is supposed to be associated with a "request" that the
 server is processing.
 >
 > (I say "request" and not "packet" because in the DNS code, a packet only
 exists when it is received, and when a packet is created before being
 sent.  In the meantime, information about the request is passed round the
 system in a different class.)
 Fair enough. I now know what the CONTEXT_CREATE and CONTEXT_DESTROY are
 for, but the next person who will look at it, will need to get through the
 same discovery pains as I did. If someone asks me: what the context_create
 and context_destroy callouts are for, I don't have any place to point him
 to.

 > > I have doubts if such callout would be useful as you don't really know
 at this stage what type of packet you're about to get, so you user library
 can't really prepare for it.
 > Possibly in DHCP, but for packets received by the DNS resolver (say),
 all packets are requests (except for those that are rejected as being
 invalid).
 Ok.

 > src/lib/util/hooks/library_handle.h

 > > If we remove the (not needed in my opinion) capabity for one library
 to have multiple hooks, the whole structure will be a lot simpler.
 > It will, but the current system allows flexibility.  As we don't know
 what user libraries will be required to do in the future, we should not
 close off options.  For example, a user library may itself dynamically
 load another library and want to install some of those functions as
 callouts.
 If library A wants to load library B, then library B should register its
 own callouts. This is a second point that we are unlikely to agree on. We
 could try to convince each other, but we could use our ideas on what the
 users will do with the hooks as arguments, so the dicussion is not very
 productive. I yield - let the code stay as it is now.

 > > Here's how I imagine the callouts would work. For each hook point,
 there is a list of installed callouts (from all libraries). The server
 prepares data (couple call  to setArgument()) once and then calls
 registered callouts by iterating over the list.
 > That's what is now done.
 Great.

 > > Comment for !CalloutPtr: I understand why the return type is int
 (there are no enums in C),
 > Actually there are enums in C.
 Ok, so I got confused. Upon further checking, enums were not defined in
 the original K&R C, but were added in ANSI C and later in C89. Taking into
 consideration your point below, it doesn't make sense to use enums,
 though.

 > > but we need well defined return types (SUCCESS, ERROR, SKIP, etc.).
 > SUCCESS and COMPLETE are now public values in !CalloutHandle.  I'll
 leave all positive values as errors - we may decide to log a returned
 error value, so don't want callouts to feel constrained as to what error
 codes they return.
 Did you mean "negative" values for errors? COMPLETE now has value 1, so
 -1, -2, -3 will signify some errors. Is that correct?

 > '''src/lib/util/hooks/callout_handle.h'''
 > > I have strong objection to the lifetime of the !CalloutHandle object
 from the performance perspective. The description says that it is create
 per packet and per library.
 > Where did it say that? (See below for an explanation.)
 That was my understanding of the text that describes CalloutHandle. It
 mentions per-packet context and per-library handle. Given the comments
 above, I think the code is ok, but the documentation needs a simple
 explanation of the lifetime for each object.

 > > How are libraries are supposed to pass data between them? If I
 understand the life span of all the objects, they can't. This is a must-
 have capability.
 > As you wrote above, libraries can pass information via arguments.
 However, if two libraries are so tightly-coupled that they pass private
 (i.e. nothing to do with the server) information between them, they
 perhaps should be one library.
 Not really. Two real life examples:
 1) library A does client classification, library B does logging
 2) library A does access control (checks against customer database, checks
 who paid their bills etc.), library B does statistics

 Those libraries can be provided by separate entities, e.g. ISC (customer
 having problems? Please install out "log-everything" debug library) and
 Comcast (we'll likely to never see their proprietary code, but we want to
 debug what their library did to the packet and what values it set).

 > There is one !CalloutHandle per request processed by a server.  When
 created, internally it creates a number of "context" objects capable of
 storing name/value pairs.  The number of these objects is equal to the
 number of libraries loaded. (Actually it is lazy creation - they are only
 created when a callout in a particular library creates packet-specific
 context.)
 Ok. I think I understand it now, but it is very confusing. Arguments are
 stored in CalloutHandle, so they are shared between libraries. Contexts
 are unique for each packet and for each library, but they are created only
 if the library calls setContext()?

 I still think the per-library context is not needed, because it duplicates
 the context_create and context_destroy hook points. We've been over this
 couple of times already, so I doubt we will reach an agreement. I'm
 reluctantly agreeing to keep the context in CalloutHandle.

 > When the !CalloutHandle is passed to a callout in the first library -
 any callout registered by that library, whatever the hook - the
 getContext/SetContext methods will set and retrieve data from the first
 context object.  When a callout from the second library is called - again,
 any callout registered by the library, whatever the hook -
 getContext/setContext will access the second context object.
 >
 > In this way each library has its own context "namespace".  The first
 library can pass per-request information between callouts by setting value
 named "foobar", secure in the knowledge that even if a second library
 (possibly written elsewhere) also uses "foobar", that use will not affect
 the first library's operation.
 >
 > > I think the !CalloutHandle should have modified lifespan. It is per
 packet, per hook point, shared among all libraries that installed a
 callout for a given hook point.
 > I strongly disagree.  We need a way to pass information between callouts
 in the same library. As I note above, if two libraries coupled enough need
 to pass information between one another, the functions should perhaps be
 in the same library. (Alternatively, they can use {set,set}Argument().)
 I think we're arguing about different things, while in fact both use cases
 are covered. Is the following true?

 {{{
 If you want to share data between callouts from different libraries, use
 set/getArgument()

 If you want to share data between different callouts from the same
 library, use set/getContext()
 }}}

 > > I do think that !ContextCollection, context_collection_ and all
 associated methods that operate on it are not needed. The library can
 store such information on its own if it is needed. See my comments about
 life span above.
 > This would make it much more complicated to pass information between
 callouts in the same library.  Yes, a library could do it - but it would
 have to store information keyed by some identification and every library
 requiring such context would have to provide its own mechanism.  That
 identification would still have to be passed from hook to hook throughout
 the server. (A packet address might not be enough - for example, a DNS
 resolver can generate multiple separate queries when processing a request.
 Added to that, unlike the DHCP server, the resolver can be processing many
 separate requests at the same time.  The DNS team have already identified
 a need to track what downstream requests are made when a given request
 appears: the hooks mechanism - with per-request context - provides a
 ready-made way of doing this.)
 As I said above, I agree that we can keep it, but I'm unconvinced that it
 is needed. Perhaps it will come to me over time.

 > Regarding the DHCP server itself, it processes one packet at a time.  If
 this approach is kept, it would be entirely possible for it to use a
 singleton !CalloutHandle and just modify it at every call. (This may mean
 adding some methods to clear out all data, but that is easily done.)
 If the DNS execution model is different, then let's keep it as it is now.

 ----

 This concludes answers to the previous discussions. The review will
 continue tomorrow. It will cover callout_manager.cc|h and all tests.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2974#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development