BIND 10 #2974: Implement Hooks Framework (part 1) - all but library loading and unloading
BIND 10 Development
do-not-reply at isc.org
Tue Jun 11 17:02:48 UTC 2013
#2974: Implement Hooks Framework (part 1) - all but library loading and unloading
-------------------------------------+-------------------------------------
Reporter: stephen | Owner: tomek
Type: task | Status:
Priority: medium | reviewing
Component: Unclassified | Milestone:
Keywords: | Sprint-DHCP-20130620
Sensitive: 0 | Resolution:
Sub-Project: DHCP | CVSS Scoring:
Estimated Difficulty: 0 | Defect Severity: N/A
Total Hours: 0 | Feature Depending on Ticket:
| Add Hours to Ticket: 0
| Internal?: 0
-------------------------------------+-------------------------------------
Comment (by tomek):
Replying to [comment:5 stephen]:
> Wow, where to start? I'll take this a bit at a time, but first I'd like
to clear up some terminology:
I'm sorry the review is so extensive, but so is the code under review.
With such complex issues at hand, it would be better to split it into
smaller chunks. But it is too late for that - let's continue with this as
one ticket.
To make this discussion a bit more readable, I removed your responses that
I agree with or that address my previous comments.
> * !Hook/Hook Point - used interchageably, this is a point in the code at
which a call to user-written functions is made. Each hook has a name and
each hook can have any number (including 0) of user-written functions
attached to it.
>
> * Callout - a user-written function. This is so-named because the
server "calls out" to the library to execute a user-written function.
At first I got the impression that "hook" is overloaded and has 2
meanings. Thanks for clarifying it. I have updated Terminology section of
Bind10HooksFramework a bit to reflect that.
> > The code under review implements many features that are not needed in
my opinion. Here's my recommendation for the user library designer:
> >
> > * if you need to store something for the whole lifetime of the
library, allocate it in load() and deallocate in unload()
> >
> > * if you need to store something for packet lifetime, allocate it in
context_create callout and deallocate in context_destroy callout.
> >
> > * if you need to share something with other libraries, use
setArgument(). All callouts called after yours will have access to that
information
>
> Yes, that's the philosophy behind it.
>
> > Please move all code to the src/lib/hooks directory.
>
> I'd come to that conclusion myself when considering the hook code,
mainly because it will need to include some logging calls. I'll do that
as the first task of #2980.
Ok. Anything that makes this ticket smaller is a good thing. :-)
> > There is no Developer's Guide documentation. Please add .dox file that
explains the big picture. It doesn't have to be long.
> I'll split #2982 into user-documentation (how to write a hook) and
maintainer documentation (how the hooks framework works). Regarding the
former, I'll check with Jeremy as to whether this should be a .dox file or
an Docbook guide.
Ok. I recommend it to be part of the Developer's guide. All other things
being equal, Doxygen wins, because it is easier to reference to other code
pieces.
> > registerHook() - this method should be renamed to avoid confusion with
the hooks registered by the user library.
> The current naming is that !ServerHooks contains "registerHook", which
makes a hook known to the hooks system. Libraries use "registerCallout"
which takes as argument the name of a hook.
Fair enough. I initially thought that registerHook is for registering
callouts, but once I got used to the terminology, it is fine as it is now.
> > !ServerHooks() constructor: Do we have a naming convention for hook
names?
> No - they are just strings. We can name them as needed. However, we
may want to take into account the fact that we plan for "standard
callouts" to have the same name as the hooks to which they will be
attached, and that the standard for C/C++ function names is to use
lower0case.
That works for me. Whatever way we choose, it should be documented
somewhere.
> > Since !ServerHooks is created just once and is kept around until the
server is shut down, it should be a singleton. I don't like the idea of
using shared_ptr (e.g. in !LibraryHandle constructor). This is against the
philosophy of shared_ptr that is expected to a garbage collector. No such
thing is needed for !ServerHooks.
> I'll probably do that, but I'd like to defer it to #2980 to avoid making
any more changes than absolutely needed at the moment.
Ok, but please update 2980 with a simple note for this and previous work.
I don't want those things to slip through the cracks.
> '''src/lib/util/hooks/server_hooks.cc'''[[BR]]
> > We need to have a list of defined hooks.
> The hooks will be different for the different modules (resolver, auth,
dhcp4 etc.) Regarding DHCP, I thought we did have a [wiki:DhcpHooks list
of DHCP hooks].
Yes, we have that, but my point was more about documentation being
detached from the code, so it is painful to keep both of them in sync. If
the doc was kept in *.h|cc files, it would be much easier to keep it up to
date. But let's defer it to the documentation task.
> > I would prefer to have a separate header file with hook names. That
header file should have little more than a list of static consts
withexplanations.
> Hooks are registered by the server, so it is up to the server author how
they do that. The updated version of !ServerHooks offers the possibility
of distributed registration functions. However, if you want to register
all the hooks up fron in one module, that's OK as well.
Ok. I think I first wrote my point in a scratchpad, then we had a
discussion on Jabber and then I've posted my review with this particular
point no longer being relevant. Sorry for the confusion. I consider this
particular point to be agreed on.
> > It would be useful for at least 2 purposes. First,
> > it would detect any misspellings at compilation time.
> As hook names are strings, I'm not sure how misspellings would be
detected.
One could use predefined constant instead of typing the hook name every
time. But that was just a mild suggestion and you chose to not implement
it. That is fine.
> > For each hook, we need to specify at least the following:
> Yes, we need documentation, agreed. But this documentation could start
getting very complex and may be better in a !DocBook. In part, there are
three audiences for documentation:
>
> 1. Administrators
> 2. User-library programmers
> 3. BIND 10 maintainers
>
> The question is whether the documentation for (2) should be mixed with
that for (3).
My recommendation is to keep the documentation as close to the source code
as possible, especially 2. and 3. See http://klub.com.pl/dhcpv6/doxygen/
for example of how extensive the Doxygen based documentation can be.
> > I don't like the CONTEXT_CREATE and CONTEXT_DESTROY names. They are
too vague. Which context you have in mind? I would rename them to
DHCP6_PKT_CONTEXT_CREATE ...
> These hooks are general hooks managed by the hooks system. The idea is
that when a !CalloutHandle is created, CONTEXT_CREATE is automatically
called: when the handle is destroyed, CONTEXT_DESTROY is called. A
!CalloutHandle is supposed to be associated with a "request" that the
server is processing.
>
> (I say "request" and not "packet" because in the DNS code, a packet only
exists when it is received, and when a packet is created before being
sent. In the meantime, information about the request is passed round the
system in a different class.)
Fair enough. I now know what the CONTEXT_CREATE and CONTEXT_DESTROY are
for, but the next person who will look at it, will need to get through the
same discovery pains as I did. If someone asks me: what the context_create
and context_destroy callouts are for, I don't have any place to point him
to.
> > I have doubts if such callout would be useful as you don't really know
at this stage what type of packet you're about to get, so you user library
can't really prepare for it.
> Possibly in DHCP, but for packets received by the DNS resolver (say),
all packets are requests (except for those that are rejected as being
invalid).
Ok.
> src/lib/util/hooks/library_handle.h
> > If we remove the (not needed in my opinion) capabity for one library
to have multiple hooks, the whole structure will be a lot simpler.
> It will, but the current system allows flexibility. As we don't know
what user libraries will be required to do in the future, we should not
close off options. For example, a user library may itself dynamically
load another library and want to install some of those functions as
callouts.
If library A wants to load library B, then library B should register its
own callouts. This is a second point that we are unlikely to agree on. We
could try to convince each other, but we could use our ideas on what the
users will do with the hooks as arguments, so the dicussion is not very
productive. I yield - let the code stay as it is now.
> > Here's how I imagine the callouts would work. For each hook point,
there is a list of installed callouts (from all libraries). The server
prepares data (couple call to setArgument()) once and then calls
registered callouts by iterating over the list.
> That's what is now done.
Great.
> > Comment for !CalloutPtr: I understand why the return type is int
(there are no enums in C),
> Actually there are enums in C.
Ok, so I got confused. Upon further checking, enums were not defined in
the original K&R C, but were added in ANSI C and later in C89. Taking into
consideration your point below, it doesn't make sense to use enums,
though.
> > but we need well defined return types (SUCCESS, ERROR, SKIP, etc.).
> SUCCESS and COMPLETE are now public values in !CalloutHandle. I'll
leave all positive values as errors - we may decide to log a returned
error value, so don't want callouts to feel constrained as to what error
codes they return.
Did you mean "negative" values for errors? COMPLETE now has value 1, so
-1, -2, -3 will signify some errors. Is that correct?
> '''src/lib/util/hooks/callout_handle.h'''
> > I have strong objection to the lifetime of the !CalloutHandle object
from the performance perspective. The description says that it is create
per packet and per library.
> Where did it say that? (See below for an explanation.)
That was my understanding of the text that describes CalloutHandle. It
mentions per-packet context and per-library handle. Given the comments
above, I think the code is ok, but the documentation needs a simple
explanation of the lifetime for each object.
> > How are libraries are supposed to pass data between them? If I
understand the life span of all the objects, they can't. This is a must-
have capability.
> As you wrote above, libraries can pass information via arguments.
However, if two libraries are so tightly-coupled that they pass private
(i.e. nothing to do with the server) information between them, they
perhaps should be one library.
Not really. Two real life examples:
1) library A does client classification, library B does logging
2) library A does access control (checks against customer database, checks
who paid their bills etc.), library B does statistics
Those libraries can be provided by separate entities, e.g. ISC (customer
having problems? Please install out "log-everything" debug library) and
Comcast (we'll likely to never see their proprietary code, but we want to
debug what their library did to the packet and what values it set).
> There is one !CalloutHandle per request processed by a server. When
created, internally it creates a number of "context" objects capable of
storing name/value pairs. The number of these objects is equal to the
number of libraries loaded. (Actually it is lazy creation - they are only
created when a callout in a particular library creates packet-specific
context.)
Ok. I think I understand it now, but it is very confusing. Arguments are
stored in CalloutHandle, so they are shared between libraries. Contexts
are unique for each packet and for each library, but they are created only
if the library calls setContext()?
I still think the per-library context is not needed, because it duplicates
the context_create and context_destroy hook points. We've been over this
couple of times already, so I doubt we will reach an agreement. I'm
reluctantly agreeing to keep the context in CalloutHandle.
> When the !CalloutHandle is passed to a callout in the first library -
any callout registered by that library, whatever the hook - the
getContext/SetContext methods will set and retrieve data from the first
context object. When a callout from the second library is called - again,
any callout registered by the library, whatever the hook -
getContext/setContext will access the second context object.
>
> In this way each library has its own context "namespace". The first
library can pass per-request information between callouts by setting value
named "foobar", secure in the knowledge that even if a second library
(possibly written elsewhere) also uses "foobar", that use will not affect
the first library's operation.
>
> > I think the !CalloutHandle should have modified lifespan. It is per
packet, per hook point, shared among all libraries that installed a
callout for a given hook point.
> I strongly disagree. We need a way to pass information between callouts
in the same library. As I note above, if two libraries coupled enough need
to pass information between one another, the functions should perhaps be
in the same library. (Alternatively, they can use {set,set}Argument().)
I think we're arguing about different things, while in fact both use cases
are covered. Is the following true?
{{{
If you want to share data between callouts from different libraries, use
set/getArgument()
If you want to share data between different callouts from the same
library, use set/getContext()
}}}
> > I do think that !ContextCollection, context_collection_ and all
associated methods that operate on it are not needed. The library can
store such information on its own if it is needed. See my comments about
life span above.
> This would make it much more complicated to pass information between
callouts in the same library. Yes, a library could do it - but it would
have to store information keyed by some identification and every library
requiring such context would have to provide its own mechanism. That
identification would still have to be passed from hook to hook throughout
the server. (A packet address might not be enough - for example, a DNS
resolver can generate multiple separate queries when processing a request.
Added to that, unlike the DHCP server, the resolver can be processing many
separate requests at the same time. The DNS team have already identified
a need to track what downstream requests are made when a given request
appears: the hooks mechanism - with per-request context - provides a
ready-made way of doing this.)
As I said above, I agree that we can keep it, but I'm unconvinced that it
is needed. Perhaps it will come to me over time.
> Regarding the DHCP server itself, it processes one packet at a time. If
this approach is kept, it would be entirely possible for it to use a
singleton !CalloutHandle and just modify it at every call. (This may mean
adding some methods to clear out all data, but that is easily done.)
If the DNS execution model is different, then let's keep it as it is now.
----
This concludes answers to the previous discussions. The review will
continue tomorrow. It will cover callout_manager.cc|h and all tests.
--
Ticket URL: <http://bind10.isc.org/ticket/2974#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list