BIND 10 #2974: Implement Hooks Framework (part 1) - all but library loading and unloading
BIND 10 Development
do-not-reply at isc.org
Thu Jun 6 14:50:09 UTC 2013
#2974: Implement Hooks Framework (part 1) - all but library loading and unloading
-------------------------------------+-------------------------------------
Reporter: stephen | Owner:
Type: task | stephen
Priority: medium | Status:
Component: Unclassified | reviewing
Keywords: | Milestone:
Sensitive: 0 | Sprint-DHCP-20130606
Sub-Project: DHCP | Resolution:
Estimated Difficulty: 0 | CVSS Scoring:
Total Hours: 0 | Defect Severity: N/A
| Feature Depending on Ticket:
| Add Hours to Ticket: 0
| Internal?: 0
-------------------------------------+-------------------------------------
Changes (by tomek):
* owner: tomek => stephen
Comment:
Reviewing all changes on brach up to commit
64bc2cfab72d67bf9e4f3cd19fd6ebb586208e67.
This is a bit of unusual review. I did review the code and then had a
discussion with Stephen on jabber. Many of the points below were
agreed on jabber and they will cause the code to be refactored. This
means that in many cases tests review is not needed at this stage as
the tests will be refactored as well.
== Alternative design proposal ==
My idea for hooks was as discussed yesterday on jabber: For each hook
point (around 25 defined for DHCPv4 and DHCPv6) there should be a list
(or any other container that can be iterated over cheaply) of
installed callouts. The hook execution code on the server-side should
look like this:
{{{
if (list for a given hook point is not empty) {
CalloutHandle handle;
handle.setArgument("packet", pkt6);
handle.setArgument(..., ...);
for (each callout on the list) {
status = execute_callout(handle);
if (status != SUCCESS) {
// log issue or skip status code here;
break;
}
}
clean up parameters in callouthandle; // if needed
}
}}}
!CalloutHandle is the only structure that is exposed to callouts. All
other structures are kept on the server-side and are never seen or
accessible on user side.
That looks like an optimal solution from the performance
perspective. Note several aspects: arguments are prepared only if
there are callouts that will use them. Arguments are prepared only
once for each hook point. Having 1000 libraries that do not install
anything does not introduce any performance penalty. Having 1000
libraries that install callouts for the same hook point slows down only
that particular hook point.
The callouts container is never exposed to the callouts, so callout
separation is maintained (one library cannot unistall another's
library callout). This design allows passing modified data between
callouts. For example you have 2 libraries: one handles client
classification and second handles logging. The first library looks at
the packet and decides to assign it to class foo (using setArgument()
or modifying passed Pkt6Ptr). The second library will be able to use
that extra argument and log it.
The code under review implements many features that are not needed in
my opinion. Here's my recommendation for the user library designer:
- if you need to store something for the whole lifetime of the
library, allocate it in load() and deallocate in unload()
- if you need to store something for packet lifetime, allocate it in
constext_create callout and deallocate in context_destroy callout.
- if you need to share something with other libraries, use
setArgument(). All callouts called after yours will have access to
that information
== Other general comments ==
Please move all code to the src/lib/hooks directory. This is an equal
feature compared to log or exceptions, so it warrants a separate
directory. Also, the hooks code being at the same directory level as
tests directory is confusing.
There is no Developer's Guide documentation. Please
add .dox file that explains the big picture. It doesn't
have to be long. This may be a separate ticket if you like.
We did discuss the hook point names registration procedure yesterday.
I'd prefer to have it as a separate ticket, but it is also ok to have it
as part of this (already large) ticket.
== Specific file comments ==
'''src/lib/util/hooks/server_hooks.h'''
!ServerHooks class description: "points at in the server" (please remove
'at').
registerHook() - this method should be renamed to avoid confusion with
the hooks registered by the user library. I would call it
registerHookType(), registerHookPoint() or registerHookName(). The
comment should also explain that this method introduces a hook type,
e.g. "PACKET_PARSED" or similar, not the callout for a given hook
(that was the first thing that came to my mind when I saw
"registerHook". It would be helpful to have an illustrative
example. If there are 10 user libraries that want to do something with
received packet, registerHook() will be called only once. In fact, I
think it would be great if the Developer's Guide could describe the
process. Please pick one simple hook, and walk through the 3 things.
First is about initialzing hook names. The second is about registering
callouts from libraries. The third one is about actual packet
processing.
ServerHooks() constructor: Do we have a naming convention for hook
names? From what you used, I assume that it is lower case with
underscores and digits. I would prefer capital letters (due to
resemblance to C/C++ constants), but I can live with lowercase.
Since !ServerHooks is created just once and is kept around until the
server is shut down, it should be a singleton. I don't like the idea
of using shared_ptr (e.g. in !LibraryHandle constructor). This is
against the philosophy of shared_ptr that is expected to a garbage
collector. No such thing is needed for !ServerHooks.
getIndex()
You are mixing exceptions and negative values for reporting errors.
I'm ok with that, but is there a reason for that?
'''src/lib/util/hooks/server_hooks.cc'''
We need to have a list of defined hooks. It should be part of the
Developer's guide documentation. In my opinion it should be done as
soon as possible ("now"), otherwise we'll start adding hooks and they
will not be documented properly. Filling in the gaps later will be
much more difficult and we'll never be sure if we covered
everything. That list should be updated every time a new hook is
added. For now, we have only two hooks, but the list will grow fast
very soon.
I would prefer to have a separate header file with hook names. That
header file should have little more than a list of static consts with
explanations. It would be useful for at least 2 purposes. First,
it would detect any misspellings at compilation time. Second, it could
be heavily extended with comments for each hook point. I plan to
leverage that when writing documentation.
For each hook, we need to specify at least the following:
- hook name (e.g. context_create)
- hook description (what does the hook do, what callout are and are
not allowed to do. This is very important. In some cases various
operation will not be allowed or at least will not make sense, e.g.
when processing options it doesn't make sense to modify incoming
packet buffer - it was already parsed and is really useless at this
stage)
- hook parameters (what parameters are passed)
- allowed hook return parameters (some hooks will allow different set
of return codes, e.g. SKIP will sometimes not be allowed)
I don't have any specific recommendation, but the format we define
here should be consistent and used in all hook types. I was thinking
about a macro around ServerHooks::registerHook() with those parameters
that would be ignored by the compiler, but could be harvested with
clever grep or a short perl script.
I don't like the CONTEXT_CREATE and CONTEXT_DESTROY names. They
are too vague. Which context you have in mind? I would rename them
to DHCP6_PKT_CONTEXT_CREATE (if you meant DHCPv6 only),
DHCP4_PKT_CONTEXT_CREATE (if you meant DHCPv4 only),
DHCP_PKT_CONTEXT_CREATE (if you want them to be shared between both
DHCPv4 families, which is not a good idea in my opinion).
Finally, this part of the code is supposed to be shared between DNS
and DHCP, so introducing DHCP at this stage is not a good idea. If
you decide to rename it to PKT_CONTEXT_CREATE (called before receiving
any packet, DHCPv4, DHCPv6 or DNS), it would be somewhat ok to keep it
here, but I have doubts if such callout would be useful as you don't
really know at this stage what type of packet you're about to get, so
you user library can't really prepare for it.
'''src/lib/util/tests/server_hooks_unittest.cc'''
DuplicateHooks test.
"duplcate" => "duplicate"
There is no test that checks that index returned for missing hook is -1.
'''src/lib/util/hooks/library_handle.h'''
LibraryHandle description: "asssociated" - please remove one 's'.
I have 2 comments about setContext(), getContext(), deleteContext()
and similar. First, what type of information can be stored here? This
data will be stored per-library, right? So that info in stored/shared
by the whole library. Why do we need it? Each library can handle this
on its own. Do you expect the server code to call it every time before
every callout to set up packet pointers, interface names, options and
other objects that should be passed? If that is so, I strongly object
to their existence. If we have 10 user libraries, such a thing should
be called once, not 10 times. And I'm not talking about a wrapper
method - I'm concerned about performance.
If this is for library developer to keep some data, then my
recommendation is to get rid of it. It is not needed. It's a bit like
saying "here's vector<int> for you. You may use it if you need to
store integers". The library implementer can store his own data on his
own. He doesn't need such mechanism.
deregisterCallout() should return bool - a status if the
deregistration was successful. Why does it take callout parameter? Do
we allow one library to install more than one hook for a given hook
point?
deregisterAll() - the description should be clear that it deregisters
all hooks from *this* library. Hooks intalled by other libraries will
not be affected. I thought that deregisterAll() would deregister all
callouts from all hook points.
getHookIndex() - why do we need this method here? Is it because you
want to allow the library to register its own hooks? If there isn't
any strong reason for it, the method should be moved to ServerHooks
class and in my understanding is that it will always return the same
response for all libraries.
If move to the design I proposed, registerCallout() and
deregisterCallout() methods should be moved to ServerHooks and be
declared static. deregisterAll(), calloutsPresent() and callCallouts()
should be removed.
More details why I don't like this current approach: In general, I
don't like the data layout. Each library has its own set of callouts
that need to be inspected during every callout and that operation will
need to be repeated many, many thousand times per second. Let me give
you some estimates. We currently are able to support 8000 leases/sec
or 16000 transactions. We had 26 hooks defined, but we know there will
be more. Some of the hooks are v4, but most are v6. Let's say there
will be 20 v6 hooks. With 16000 transactions and 20 hooks, this code
will be exercized 320000 times per second. This is non-trivial
performance hit.
If we remove the (not needed in my opinion) capabity for one library
to have multiple hooks, the whole structure will be a lot simpler. I
also think we should revisit the way LibraryHandle stores the
callouts. Right now it imposes performance penalty even when there is
no callout installed! That is not acceptable.
Here's how I imagine the callouts would work. For each hook point,
there is a list of installed callouts (from all libraries). The server
prepares data (couple calls to setArgument()) once and then calls
registered callouts by iterating over the list. This is the best
solution as far as I'm aware from performance perspective and I have 2
very strong examples to back my claim up. The first one is how
packet handlers are implemented in Linux kernel. The other example is
protocol handlers in ANVL. Although ANVL runs on many platforms
(e.g. Linux), it does not use Linux stack. It has a concept of
protocol handlers. You can install protocol handler on any layer
(Ethernet, IP, UDP, DHCP etc.) and do additional checks or
modifications.
My comments are often just suggestions. This one is not. If you force
this design, then you'll get major performance hit and once you tell
me to start improving performance, the first thing I'll do is to
rewrite it as explained above.
Comment for CalloutPtr: I understand why the return type is int (there
are no enums in C), but we need well defined return types (SUCCESS,
ERROR, SKIP, etc.). Please add them to some header file. I'm not sure
which header would be best for it. I would argue that keeping
CalloutPtr in server_hooks.h is something to consider. User library
developer
could then import just one header. But I don't insist on it - keeping
it in several different headers is fine as well. On the other hand,
we don't want the library developer to start calling ServerHooks
methods, so perhaps it is not that good idea after all.
'''src/lib/util/hooks/library_handle.cc'''
What is the reason to split checkHookIndex() and getHookIndex()? It is
inefficient to call checkHookIndex() if you need to call
getHookIndex() later anyway.
callCallouts() calls all callouts, stopping if the "skip" blag is set.
Skip flag was supposed to mean "skip the next operation to be
conducted by the server". The code uses it as "skip any other callouts
registered by this library". We never even agreed to have more than one
callout registered by the same library. And I strong object to it.
If user wants to have multiple functions called, let him handle this
in his callout function. We discussed COMPLETE status code (or a
flag), which means that no other callbacks are called after this
callback. Is it implemented anywhere?
I believe callCallouts() method has a bug in it. I thought "skip"
meant "callout did the work, server will skip the next p rocessing
step". If I understand the code correctly, currently skip means "skip
other call outs installed by this library".
If you agree with the design I proposed, this method could be moved to
ServerHooks and it would be called by the DHCP server code.
'''src/lib/util/hooks/callout_handle.h'''
NoSuchArgument comment:
"attempt is made access an argument" =>
"attempt is made to access an argument"
I have strong objection to the lifetime of the CalloutHandle object
from the performance perspective. The description says that it is
create per packet and per library. How are libraries are supposed to
pass data between them? If I understand the life span of all the
objects, they can't. This is a must-have capability.
How/when do you envision this handle to be created and later
destroyed? According to the description, CalloutHandle would have
exactly the same lifetime as Pkt4 or Pkt6 objects, but will be
multiplied by the number of libraries installed. This has serious
issues. Can you describe CalloutHandle object life-cycle?
In particular, if you have 10 libraries and 20 hook points. How many
CalloutHandle objects (including initialization) you need? 200?
If that is true, that's a perfromance killer.
CalloutHandle class description explains Arguments, per-packet context
and per-library context. If you agree with my suggestion in my
previous paragraph, most of it is no longer needed.
I think the CalloutHandle should have modified lifespan. It is per
packet, per hook point, shared among all libraries that installed a
callout for a given hook point.
I do think that ContextCollection, context_collection_ and all
associated methods that operate on it are not needed. The library can
store such information on its own if it is needed. See my comments
about life span above.
I did not review unit-tests for callout_handle and library_handle as
it is not clear at this point which part of the code will be kept
after refactoring and which will be discarded.
I'm sorry if the review is harsh, but it is essential that there are
no uncecessary performance hits here. Currently performance for
memfile is 16000 transactions/sec. There are 27 hook points planned
for DHCPv6 now, but we know that we are missing some (e.g. select
subnet). In the typical deployment, there will be at least two
libraries installed (one for client classification and second for host
reservation). We are talking about code that is being exersized at
least 864 thousand times per second. This is really performance
critical section. Every little "if" statement counts. Otherwise we
could spend a lot of time removing the bottlenecks from the code next
year.
--
Ticket URL: <http://bind10.isc.org/ticket/2974#comment:4>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list