[kea-dev] Requirements for statistics module in Kea

Marcin Siodelski marcin at isc.org
Tue Apr 7 09:26:25 UTC 2015



On 03.04.2015 17:20, Tomek Mrugalski wrote:
> Folks,
> One of the major features in upcoming 0.9.2 release are statistics. I
> just wrote an initial set of requirements for this piece of code:
>
> http://kea.isc.org/wiki/StatsRequirements
>
> I'd love to hear your comments. I plan to work on the design next week.
> There's no strict deadline for your feedback, but the sooner you provide
> it the better.
>

"MUST be able to store the following types: integer values (64-bits), 
floating (double precision), timestamp, time. "

Although the typical use case is maintaining counters that are updated
on specific events, we should not preclude storing other types of
information, such as strings or booleans. The data gathered along with
other statistics may include, for example, "server-state": running,
reconfiguring, starting-up etc. If a stats query is issued while the
server is starting up and all counters are 0, a response including
"server-state" = "starting-up" would explain why. This is just one
example, but I assume there will be other kinds of information that we
may want to include in responses to statistical queries, so it is not a
good idea to narrow the types down to numbers and times.
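
To illustrate the point, here is a minimal Python sketch (not Kea code)
of a store that accepts the four types from the quoted requirement plus
strings and booleans. The class name StatsStore, its methods and the
statistic names "server-state" and "pkt4-received" are all made up for
this example.

```python
from datetime import datetime, timedelta
from typing import Dict, Union

# Hypothetical set of value types: the first four mirror the quoted
# requirement, str and bool are the additions suggested in this mail.
StatValue = Union[int, float, datetime, timedelta, str, bool]
_ALLOWED = (int, float, datetime, timedelta, str, bool)


class StatsStore:
    """Toy in-memory store mapping a statistic name to a typed value."""

    def __init__(self) -> None:
        self._values: Dict[str, StatValue] = {}

    def set_value(self, name: str, value: StatValue) -> None:
        # Reject anything outside the agreed set of types.
        if not isinstance(value, _ALLOWED):
            raise TypeError(f"unsupported type for {name}: {type(value).__name__}")
        self._values[name] = value

    def add(self, name: str, delta: Union[int, float] = 1) -> None:
        # Counters start at 0; only numeric statistics can be incremented.
        current = self._values.get(name, 0)
        if isinstance(current, bool) or not isinstance(current, (int, float)):
            raise TypeError(f"{name} is not a numeric statistic")
        self._values[name] = current + delta

    def get_all(self) -> Dict[str, StatValue]:
        return dict(self._values)


store = StatsStore()
store.set_value("server-state", "starting-up")
store.add("pkt4-received", 0)
```

With such a store, a query issued during startup would return the zero
counter together with the state that explains it.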

"SHOULD be able to generate statistics over time (e.g. how much a given 
statistic had changed over last 10 seconds, last 5 minutes, last hour)"

In addition to this, I wonder if it would be useful to hold the mean
value for particular counters. For example: "the average number of
packets received over the last hour" or "the average pool utilization
over the past day".
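
One possible way to keep such a mean is a sliding window of timestamped
samples. The Python sketch below is only an illustration under that
assumption; the class WindowedMean and the sample values are invented
for the example, and a real implementation may prefer fixed-size
buckets to bound memory use.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WindowedMean:
    """Mean of the samples recorded during the last `window` seconds."""

    def __init__(self, window: float) -> None:
        self.window = window
        self._samples: Deque[Tuple[float, float]] = deque()  # (timestamp, value)

    def record(self, timestamp: float, value: float) -> None:
        self._samples.append((timestamp, value))
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Drop samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= now - self.window:
            self._samples.popleft()

    def mean(self, now: float) -> Optional[float]:
        self._expire(now)
        if not self._samples:
            return None  # nothing observed within the window
        return sum(value for _, value in self._samples) / len(self._samples)


# Pool utilization in percent, sampled at t = 0 s, 1800 s and 3000 s.
utilization = WindowedMean(window=3600)
utilization.record(0, 20)
utilization.record(1800, 40)
utilization.record(3000, 60)
```

Asking for the mean at t = 3000 s covers all three samples, while at
t = 4000 s the first sample is already older than one hour and is no
longer counted.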

"MUST be able to retrieve a value of a single statistic"

In addition to this, shouldn't there be a requirement for gathering
statistics per scope? For example: get the statistics for a particular
subnet.
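
As a rough illustration, per-scope retrieval could be as simple as
prefixing statistic names with their scope. The naming scheme
"subnet[N].name", the function get_scope and the statistic names in
this Python sketch are assumptions made for the example, not something
taken from the requirements page.

```python
from typing import Dict, Union

Number = Union[int, float]


def get_scope(stats: Dict[str, Number], scope: str) -> Dict[str, Number]:
    """Return the statistics of one scope, with the scope prefix removed."""
    prefix = scope + "."
    return {
        name[len(prefix):]: value
        for name, value in stats.items()
        if name.startswith(prefix)
    }


# Hypothetical flat store: global counters have no prefix.
stats = {
    "pkt4-received": 120,
    "subnet[1].assigned-addresses": 10,
    "subnet[1].total-addresses": 200,
    "subnet[2].assigned-addresses": 3,
}

subnet1 = get_scope(stats, "subnet[1]")
```

Here subnet1 holds only the two entries belonging to subnet 1, and the
global "pkt4-received" counter is left out.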

"MAY be able to keep the statistics after server restart (not planned 
for 0.9.2, but may appear eventually in future releases)"

I don't object to deferring this requirement until after 0.9.2, but I
think MAY is too weak for it. There will certainly be cases where the
DHCP server has crashed and the watchdog has started it again. For
diagnostic purposes it may be useful to persist the last gathered
statistics. There is a question of how often the snapshot of statistics
should be persisted to limit the impact on server performance, but this
should be configurable. If the snapshot is not persisted frequently
enough, there may be a period not covered by the persisted statistics;
in that case the server should also preserve the time when the snapshot
was taken and return it to the caller.
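
A minimal Python sketch of this idea, assuming a JSON file as the
persistent store: the snapshot is written at most once per configured
interval and carries the time at which it was taken. The names
SnapshotWriter, load_snapshot, min_interval and the "gathered-at" key
are hypothetical.

```python
import json
import os
import time
from typing import Any, Dict, Optional


class SnapshotWriter:
    """Persist statistics to a file, at most once per `min_interval` seconds."""

    def __init__(self, path: str, min_interval: float) -> None:
        self.path = path
        self.min_interval = min_interval  # would come from the configuration
        self._last_saved: Optional[float] = None

    def maybe_save(self, stats: Dict[str, Any], now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if self._last_saved is not None and now - self._last_saved < self.min_interval:
            return False  # too soon, skip to limit the performance impact
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            # Store the snapshot time next to the values so that a reader
            # can tell how stale the persisted statistics are.
            json.dump({"gathered-at": now, "statistics": stats}, f)
        os.replace(tmp_path, self.path)  # atomic swap, survives a crash mid-write
        self._last_saved = now
        return True


def load_snapshot(path: str) -> Dict[str, Any]:
    """Read back the last snapshot, e.g. after the watchdog restarts the server."""
    with open(path) as f:
        return json.load(f)
```

After a restart the server would call load_snapshot() and return the
"gathered-at" value together with the recovered statistics.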


The current set of general requirements for the servers does not mention
how the statistics should be delivered to the requestor. In particular,
the very basic requirement should be that "Kea configuration MUST allow
for selecting the communication channel for statistical queries". And
then, "Kea servers MUST be able to dump the statistical information to
a file in the specified location". I think that dumping stats to a file
is the easiest to implement and should be taken as the first step for
0.9.2, given all the time constraints. Other possibilities could be
listed too.


Specific requirements section:
I suggest adding the following requirements:
- Servers MUST report their uptime.
- Servers SHOULD report their total idle time or "percent idle" - this 
is to assess how busy they are
- Servers SHOULD report their current idle time - this is to assess how 
busy they were in the past few seconds.
- Servers MUST report the statistics generation time so that the
requestor is able to assess if the received statistics are up to date.
- Servers SHOULD return their state, e.g. "starting-up", "running", 
"reconfiguring", "shutting-down" etc.
- Servers SHOULD report the number of address allocation conflicts that
occur because a reserved address is in use by another client.
- Servers SHOULD return the number of unknown/unsupported messages received.
- Servers SHOULD return the number of packets received on a specific
interface.
- Servers SHOULD return an average number of packets received on 
specific interfaces over time.

"DHCPV4/6 MUST be able to report number of total addresses in the pool 
for each subnet"
Shouldn't it be "total number of addresses"?

I also have a question about:
"DHCPV4/6 MUST be able to report number of assigned and released 
addresses for each subnet"

How is this supposed to be calculated? Is this the total number of
distinct addresses currently assigned, or does it include all addresses
assigned over a period of time? I am wondering whether, based on this
metric, I will be able to calculate the current utilization of the
pools in the subnet.
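
To make the difference concrete, here is a small Python sketch of the
two interpretations. The function names are made up, and the second
variant assumes that every address leaving use is counted as released,
which ignores leases that simply expire.

```python
def utilization_from_gauge(currently_assigned: int, total: int) -> float:
    """Interpretation 1: the statistic is the number of addresses in use now."""
    return currently_assigned / total if total else 0.0


def utilization_from_cumulative(assigned_total: int, released_total: int,
                                total: int) -> float:
    """Interpretation 2: both statistics only ever grow over time.

    The number in use has to be derived as the difference, which is only
    accurate if expired leases are also counted as released (an assumption).
    """
    in_use = assigned_total - released_total
    return in_use / total if total else 0.0


# A pool of 200 addresses with 50 currently in use.
gauge = utilization_from_gauge(50, 200)
cumulative = utilization_from_cumulative(300, 250, 200)
```

Both calls describe the same situation, but with cumulative counters
alone the requestor cannot compute utilization without also knowing the
pool size and the number of released (and expired) addresses.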


Marcin

