[kea-dev] Requirements for statistics module in Kea
Marcin Siodelski
marcin at isc.org
Tue Apr 7 09:26:25 UTC 2015
On 03.04.2015 17:20, Tomek Mrugalski wrote:
> Folks,
> One of the major features in upcoming 0.9.2 release are statistics. I
> just wrote an initial set of requirements for this piece of code:
>
> http://kea.isc.org/wiki/StatsRequirements
>
> I'd love to hear your comments. I plan to work on the design next week.
> There's no strict deadline for your feedback, but the sooner you provide
> it the better.
>
"MUST be able to store the following types: integer values (64-bits),
floating (double precision), timestamp, time. "
Although the typical use case here is to maintain the counters updated
on specific events, we should not preclude storing other types of
information like string or boolean. Some of the data which may be
gathered along with other statistical information may include
"server-state": running, reconfiguring, starting-up etc. So, if the
stats query is issued when the server is starting up and all counters
are 0, it may be useful to send a response including "server-state" =
"starting-up" to indicate the reason for it. This is just an example, but
I assume there will be other types of information which we may want to
include in responses to statistical queries, so it is not a good idea to
narrow the supported types down to numeric types and times.
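To make this concrete, here is a rough sketch in Python of a store that accepts strings and booleans next to the numeric and time types. The StatsStore class, its methods and the statistic names are all made up for illustration; none of this is existing Kea code.

```python
from datetime import datetime, timedelta

# Value types a statistic may hold. str and bool are the additions
# suggested here; the rest mirror the types listed in the requirement.
ALLOWED_TYPES = (int, float, str, bool, datetime, timedelta)


class StatsStore:
    """Hypothetical in-memory store mapping statistic names to typed values."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        # Reject anything outside the supported set of types.
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"unsupported statistic type: {type(value).__name__}")
        self._values[name] = value

    def add(self, name, delta=1):
        # Counters start at 0; only numeric statistics can be incremented.
        current = self._values.get(name, 0)
        if isinstance(current, bool) or not isinstance(current, (int, float)):
            raise TypeError(f"statistic {name!r} is not numeric")
        self._values[name] = current + delta

    def get_all(self):
        # Return a copy so callers cannot modify the store by accident.
        return dict(self._values)


stats = StatsStore()
stats.set("server-state", "starting-up")   # non-numeric statistic
stats.add("pkt4-received", 0)              # counter still at zero
```

A response built from get_all() would then carry the zeroed counter together with the state that explains it.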
"SHOULD be able to generate statistics over time (e.g. how much a given
statistic had changed over last 10 seconds, last 5 minutes, last hour)"
In addition to this, I wonder if it would be useful to hold the mean
value for particular counters. For example: "the average number of
packets received over the last hour" or "the average pool utilization
over the past day".
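As an illustration of both the change over a period and the mean, here is a small Python sketch that keeps timestamped samples. WindowedCounter and its methods are hypothetical names, and the mean is a plain average of the samples, which assumes they are taken at regular intervals.

```python
from collections import deque


class WindowedCounter:
    """Hypothetical statistic that remembers timestamped samples."""

    def __init__(self, max_age):
        self.max_age = max_age      # seconds of history to keep
        self.samples = deque()      # (timestamp, value) pairs, oldest first

    def record(self, timestamp, value):
        self.samples.append((timestamp, value))
        # Drop samples that fell out of the retention period.
        while self.samples and self.samples[0][0] < timestamp - self.max_age:
            self.samples.popleft()

    def _in_window(self, now, window):
        return [v for t, v in self.samples if now - window <= t <= now]

    def change_over(self, now, window):
        # Difference between the newest and oldest sample in the window.
        values = self._in_window(now, window)
        return values[-1] - values[0] if values else 0

    def mean_over(self, now, window):
        # Unweighted mean of the samples in the window, None if there are none.
        values = self._in_window(now, window)
        return sum(values) / len(values) if values else None


received = WindowedCounter(max_age=3600)
for t, value in [(0, 10), (10, 20), (20, 40), (30, 50)]:
    received.record(t, value)

print(received.change_over(now=30, window=30))  # 40
print(received.mean_over(now=30, window=30))    # 30.0
```

The retention period (max_age) bounds memory use, so the longest period we want to report on would decide how much history each statistic has to keep.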
"MUST be able to retrieve a value of a single statistic"
In addition to this, shouldn't there be a requirement for gathering the
statistics per-scope? For example: get statistics for a particular subnet.
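One way this could work is to encode the scope in the statistic name and filter on it. The sketch below is in Python; the "subnet[id]." naming scheme and the stats_for_scope function are assumptions made for this example only, not something the requirements define.

```python
def stats_for_scope(stats, scope):
    """Return the statistics named '<scope>.<name>', keyed by the short name."""
    prefix = scope + "."
    return {
        name[len(prefix):]: value
        for name, value in stats.items()
        if name.startswith(prefix)
    }


all_stats = {
    "pkt4-received": 120,                  # global statistic
    "subnet[1].total-addresses": 200,
    "subnet[1].assigned-addresses": 50,
    "subnet[2].total-addresses": 100,
}

subnet1 = stats_for_scope(all_stats, "subnet[1]")
print(subnet1)  # {'total-addresses': 200, 'assigned-addresses': 50}
```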
"MAY be able to keep the statistics after server restart (not planned
for 0.9.2, but may appear eventually in future releases)"
I don't object to deferring this requirement until after 0.9.2, but I
think MAY is too weak a requirement level for it. Certainly, there may be
cases where your DHCP server has crashed and the watchdog has started it
again. It may be useful for diagnostic purposes to persist the last
gathered statistics. There is a question of how often the snapshot of
statistics should be persisted to limit the impact on the server
performance, but this is something that should be configurable. If the
snapshot is not taken frequently enough, there may be a certain period
not covered by the persisted statistics, but in that case the server
should also preserve the time when the snapshot was gathered and return
it to the caller.
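A minimal Python sketch of what I mean by persisting a snapshot together with its gathering time follows. The file layout, the "gathered-at" key and both function names are assumptions for illustration; how often save_snapshot gets called would be the configurable part.

```python
import json
import os
import tempfile
import time


def save_snapshot(path, stats, now=None):
    """Write the statistics together with the time they were gathered."""
    snapshot = {
        "gathered-at": time.time() if now is None else now,
        "stats": stats,
    }
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a reader sees either the old or the new snapshot.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(snapshot, tmp)
    os.replace(tmp_path, path)


def load_snapshot(path):
    """Return (gathered_at, stats), or (None, {}) when no snapshot exists."""
    try:
        with open(path) as f:
            snapshot = json.load(f)
    except FileNotFoundError:
        return None, {}
    return snapshot["gathered-at"], snapshot["stats"]
```

After a restart the server would call load_snapshot() and return the stored time along with the values, so the caller can tell how stale they are.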
The current set of general requirements for the servers does not mention
how the statistics should be delivered to the requestor. In particular,
the very basic requirement should be that "Kea configuration MUST allow
for selecting the communication channel for statistical queries". And
then, "Kea servers MUST be able to dump the statistical information to
a file in the specified location". I think that dumping stats to a
file is the easiest to implement and should be taken as a first step for
0.9.2, given all the time constraints. Other possibilities could be
listed too.
Specific requirements section:
I suggest adding the following requirements:
- Servers MUST report their uptime.
- Servers SHOULD report their total idle time or "percent idle" - this
is to assess how busy they are
- Servers SHOULD report their current idle time - this is to assess how
busy they were in the past few seconds.
- Servers MUST report the statistics generation time so that the
requestor is able to assess if the received statistics are up to date.
- Servers SHOULD return their state, e.g. "starting-up", "running",
"reconfiguring", "shutting-down" etc.
- Servers SHOULD report the number of conflicts in address allocation
because the reserved address is in use by another client.
- Servers SHOULD return the number of unknown/unsupported messages received.
- Servers SHOULD return the number of packets received on a specific
interface.
- Servers SHOULD return an average number of packets received on
specific interfaces over time.
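
For the uptime and idle time items in the list above, here is a rough Python sketch of how the numbers could be derived. ActivityTracker is a hypothetical name, and treating "current idle time" as the idle seconds within a recent period is my assumption about what that item should mean.

```python
class ActivityTracker:
    """Hypothetical tracker for uptime and idle time, fed explicit timestamps."""

    def __init__(self, start_time):
        self.start_time = start_time
        self.busy_total = 0.0
        self.intervals = []     # (begin, end) of each busy period

    def record_busy(self, begin, end):
        self.busy_total += end - begin
        self.intervals.append((begin, end))

    def uptime(self, now):
        return now - self.start_time

    def percent_idle(self, now):
        # Share of the whole uptime in which the server was not busy.
        up = self.uptime(now)
        return 100.0 * (up - self.busy_total) / up if up > 0 else 100.0

    def recent_idle(self, now, window):
        # Idle seconds within the last `window` seconds: the window length
        # minus the overlap of every busy period with the window.
        start = max(now - window, self.start_time)
        busy = sum(max(0.0, min(e, now) - max(b, start))
                   for b, e in self.intervals)
        return (now - start) - busy


tracker = ActivityTracker(start_time=0)
tracker.record_busy(10, 20)
tracker.record_busy(90, 95)

print(tracker.uptime(now=100))                  # 100
print(tracker.percent_idle(now=100))            # 85.0
print(tracker.recent_idle(now=100, window=10))  # 5.0
```

A real server would not keep every busy interval forever; it would discard the ones older than the longest period it reports on.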
"DHCPV4/6 MUST be able to report number of total addresses in the pool
for each subnet"
Shouldn't it be "total number of addresses"?
I also have a question about:
"DHCPV4/6 MUST be able to report number of assigned and released
addresses for each subnet"
How is this supposed to be calculated? Is this the total number of
distinct addresses currently assigned, or does it include all addresses
assigned over a period of time? I am wondering if, based on this
metric, I will be able to calculate the current utilization of pools in
the subnet.
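To show why the distinction matters, here is the arithmetic for both readings as a short Python sketch. The function names and the numbers are made up, and the second function assumes that "released" counts every address returned to the pool; if leases that expire without a release are not counted there, the result would be too high.

```python
def utilization_from_gauge(total_addresses, currently_assigned):
    """Utilization in percent when 'assigned' means addresses in use now."""
    return 100.0 * currently_assigned / total_addresses


def utilization_from_counters(total_addresses, assigned_events, released_events):
    """Utilization in percent when both counters are cumulative event counts."""
    in_use = assigned_events - released_events
    return 100.0 * in_use / total_addresses


# A subnet with 200 addresses, 50 of them in use right now.
print(utilization_from_gauge(200, 50))           # 25.0

# The same subnet after 150 assignments and 100 releases since startup.
print(utilization_from_counters(200, 150, 100))  # 25.0
```

Reading the cumulative "assigned" value of 150 as a current count would give 75% instead of 25%, so the requirement should say which of the two is meant.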
Marcin