[kea-dev] Statistics design proposal for 0.9.2
Marcin Siodelski
marcin at isc.org
Wed Apr 15 08:43:42 UTC 2015
On 14.04.2015 15:18, Tomek Mrugalski wrote:
> Hi all,
> One of the features planned for 0.9.2 is statistics. Here is the
> proposed design: http://kea.isc.org/wiki/StatsDesign.
>
> This design is not the most advanced or fully featured. It's a
> compromise between what we could do and what we actually can do in the
> limited timeframe of the 0.9.2 release.
>
> The basic concept is that the statistics are currently simple, but they
> can evolve over future releases. Whatever evolution path we choose,
> the API should remain stable, if possible.
>
> Please review and comment.
>
I would like to clarify a comment I made at some point about the
use of concurrency when gathering statistical information. I didn't
really mean that the statistics manager should run in a separate
process. I was rather thinking that it should run in a separate thread.
This thread could create a socket and listen on the fd belonging to this
socket. This would allow for better responsiveness of the stats manager
in the presence of many DHCP packets being received on possibly many
interfaces. It would also allow certain independent tasks, such as
receiving a command, unparsing JSON, and creating and sending the
response, to run concurrently with the main thread, which handles the
DHCP stream.
Concurrent access to the StatsMgr does pose a problem: certain values
have to be locked for writing while a second thread is reading them. But
that is something that can be solved in the design of the StatsMgr.
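
To illustrate the locking concern, here is a minimal sketch of a counter store guarded by a mutex. The class name LockedStats and its methods are hypothetical and are not taken from the StatsMgr design; the sketch only assumes one thread bumping counters and another reading them.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical sketch, not the Kea StatsMgr API.
class LockedStats {
public:
    // Called by the main (DHCP) thread to bump a counter.
    void add(const std::string& name, int64_t delta = 1) {
        std::lock_guard<std::mutex> lock(mutex_);
        counters_[name] += delta;
    }

    // Called by the statistics thread; returns 0 for an unknown name.
    int64_t get(const std::string& name) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = counters_.find(name);
        return (it == counters_.end() ? 0 : it->second);
    }

private:
    mutable std::mutex mutex_;  // serializes readers and writers
    std::map<std::string, int64_t> counters_;
};
```

A single mutex is the simplest choice; whether finer-grained locking or atomic counters would be needed depends on how often the counters are bumped.
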
The stats manager's operation is going to be based on time intervals.
For example: keep statistics collected for the last 5 minutes. The use
of threads would probably make it much easier to use asio-based timers
which are asynchronous, i.e. based on callbacks invoked when specific
timers expire. You can't do that easily when you are blocked on a call
to select() in the main thread.
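
As an illustration of interval-based statistics such as "the last 5 minutes", here is a minimal sketch of a sliding-window counter. It does not use asio timers; expiry happens whenever the counter is touched, and the current time is passed in explicitly. The class name WindowedCounter and its interface are assumptions made for this example only.

```cpp
#include <chrono>
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical sketch: keeps only the samples recorded within the window.
class WindowedCounter {
public:
    typedef std::chrono::steady_clock::time_point TimePoint;
    typedef std::chrono::steady_clock::duration Duration;

    explicit WindowedCounter(Duration window) : window_(window), sum_(0) {}

    // Record a sample observed at time `now`.
    void add(int64_t value, TimePoint now) {
        expire(now);
        samples_.push_back(std::make_pair(now, value));
        sum_ += value;
    }

    // Sum of the samples that are still inside the window at time `now`.
    int64_t sum(TimePoint now) {
        expire(now);
        return sum_;
    }

private:
    // Drop samples older than the window. A timer callback could call
    // this periodically instead of relying on add() and sum().
    void expire(TimePoint now) {
        while (!samples_.empty() && now - samples_.front().first > window_) {
            sum_ -= samples_.front().second;
            samples_.pop_front();
        }
    }

    Duration window_;
    int64_t sum_;
    std::deque<std::pair<TimePoint, int64_t> > samples_;
};
```

With asynchronous timers, the expiry step could run from a callback in the statistics thread rather than on every access.
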
I take the point about the limited time for 0.9.2, but I am afraid we are
getting too attached to the idea of synchronous processing even when we
could do better. I leave it up to you, but if I understand correctly
what Shawn said at some point, the lack of concurrency with respect to
statistics is a problem in isc-dhcp.
I wonder how statistics are going to be configured. I understand that
you're planning to add invocations of the StatsMgr in multiple places in
the code where you're going to bump the counters. But is it going to be
possible to enable/disable specific counters so that they are not bumped
if not needed? Or is it assumed that the counter bumping operation is
fast enough that such an optimization would not bring much benefit?
From the "Performance Optimization" section, however, it seems that this
has been a concern of yours.
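
The enable/disable question can be sketched as follows. The class SelectiveStats and its methods are hypothetical and only illustrate the trade-off: the check for an enabled counter costs a lookup of its own, so it pays off only if bumping is noticeably more expensive than the check.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch, not the Kea StatsMgr API.
class SelectiveStats {
public:
    void enable(const std::string& name) { enabled_.insert(name); }
    void disable(const std::string& name) { enabled_.erase(name); }

    // Bumps the counter only if it has been enabled.
    void bump(const std::string& name) {
        if (enabled_.count(name) == 0) {
            return;  // counter disabled: skip the update
        }
        ++counters_[name];
    }

    // Returns 0 for a counter that has never been bumped.
    int64_t get(const std::string& name) const {
        auto it = counters_.find(name);
        return (it == counters_.end() ? 0 : it->second);
    }

private:
    std::set<std::string> enabled_;
    std::map<std::string, int64_t> counters_;
};
```
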
I am iffy about the naming of per-subnet statistics. You say,
"subnet[0].packets-received". But what if I remove the subnet with
index 0 from the configuration? The subnets will get renumbered and the
statistics will now apply to the wrong subnet. Wouldn't it be better to
identify subnets using SubnetID, which is supposed to be unique?
On a related note: does this also account for per-interface
statistics?
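
A minimal sketch of naming statistics by subnet identifier rather than by position follows. The helper subnetStatName and the exact name format are assumptions for illustration, as is the 32-bit width of SubnetID.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Assumption for this sketch: a subnet identifier is a 32-bit unsigned value.
typedef uint32_t SubnetID;

// Hypothetical helper: builds a statistic name from the subnet's unique
// identifier, so the name stays valid when other subnets are removed.
std::string subnetStatName(SubnetID id, const std::string& stat) {
    std::ostringstream os;
    os << "subnet[" << id << "]." << stat;
    return os.str();
}
```

Because the name depends only on the identifier, removing another subnet from the configuration does not change which subnet a given statistic refers to.
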
In the data extraction section we should keep in mind that
communication over the unix socket requires two sockets: one for the
client and one for the server. So I guess you'll need to extend the
"control-socket" parameter to specify two names? I am also not so sure
that choosing a string as the parameter for control-socket configuration
is the right choice. If you want to use the same parameter for future
sockets (TCP, UDP or whatever else), it may quickly turn out that you
need more parameters. If I am correct about the two names for socket
files, you already have three parameters that describe the socket
communication.
It would be useful if the design included some sample JSON requests and
responses, including responses which report errors in statistics
gathering. The organization of the JSON query and response should be
subject to review because it will be troublesome to modify once
people start implementing proprietary clients.
I also wonder if this "protocol" shouldn't be the base for the remote
management API, in which case we should take the use cases for the
management API into account here. Not that I want to start implementing
the management API right now, but just to make sure that it will be
compatible when we implement it.
On the class diagram, I still think that it may be useful to make it
generic and allow for some additional types apart from the ones you
listed. In particular, a string value. Suppose someone writes a hook and
wants to store some textual information in it, such as the last error
found.
Doesn't Observation require setValue modifiers?
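
As a sketch of what a more generic Observation with setValue modifiers might look like, consider the following. The type names, the set of supported types and the getter names are assumptions made for this example and are not taken from the class diagram.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical sketch of an observation that can hold an integer,
// a floating point value or a string.
class Observation {
public:
    enum Type { STAT_INTEGER, STAT_FLOAT, STAT_STRING };

    Observation() : type_(STAT_INTEGER), int_(0), float_(0.0) {}

    // Modifiers: each one records the value and the type it was stored as.
    void setValue(int64_t v) { type_ = STAT_INTEGER; int_ = v; }
    void setValue(double v) { type_ = STAT_FLOAT; float_ = v; }
    void setValue(const std::string& v) { type_ = STAT_STRING; string_ = v; }

    Type getType() const { return type_; }

    // Accessors throw if the stored type differs from the requested one.
    int64_t getInteger() const { check(STAT_INTEGER); return int_; }
    double getFloat() const { check(STAT_FLOAT); return float_; }
    const std::string& getString() const { check(STAT_STRING); return string_; }

private:
    void check(Type expected) const {
        if (type_ != expected) {
            throw std::logic_error("observation holds a different type");
        }
    }

    Type type_;
    int64_t int_;
    double float_;
    std::string string_;
};
```

A hook could then call setValue with a std::string to record, for example, the last error it encountered.
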
Marcin