BIND 10 #1264: Design document for DHCP benchmarking utility
BIND 10 Development
do-not-reply at isc.org
Fri Oct 21 10:51:43 UTC 2011
-------------------------------------+-------------------------------------
Reporter: | Owner: johnd
stephen | Status: reviewing
Type: task | Milestone: Sprint-
Priority: major | DHCP-20111026
Component: dhcp | Resolution:
Keywords: | Sensitive: 0
Defect Severity: N/A | Sub-Project: DHCP
Feature Depending on Ticket: | Estimated Difficulty: 0
Add Hours to Ticket: 0 | Total Hours: 0
Internal?: 0 |
-------------------------------------+-------------------------------------
Changes (by stephen):
* owner: stephen => johnd
Comment:
Additional comments can be found in [ticket:1263#comment:2 comment 2] and
[ticket:1263#comment:3 comment 3] of #1263.
Replying to [ticket:1263#comment:3 comment 3 johnd]
> I'm thinking, for the initial version, from these suggestions adding:
> :
> * Explicit description of what the exit status means. Perhaps just
> status 1 if any dropped packets, unless other pass criteria options are
> added.
For now, I think that would be sufficient.
> Should any of the other suggested options be in the initial version?
See below.
Replying to [comment:5 johnd]
> once free-formatted output is settled.
My initial thought was date/time but given the difficulty of parsing
various formats, I think that for each event, a simple time (in seconds)
since the start of the test expressed as a floating-point number would be
sufficient. So the output format would be:
{{{
perfdhcp version
Command line options
Date/time test started
send_time,receive_time[,send_time_packet_2,receive_time_packet_2]
send_time,receive_time[,send_time_packet_2,receive_time_packet_2]
:
}}}
(with the date in yyyy-mm-dd format to avoid confusion between dd/mm/yy
and mm/dd/yy.)
The last two columns are optional, being absent if only the initial packet
exchange is measured. And if a packet is lost, put -1.0 in the receive
field (and in the other two fields if a full 4-way packet exchange is
being measured).
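As an illustration of how a script might consume the timing records
described above, here is a minimal sketch in Python. The function names
are hypothetical and it assumes the records are exactly as proposed: two
or four comma-separated floating-point fields, with -1.0 marking a lost
packet.

```python
from typing import List, Optional

LOST = -1.0  # sentinel proposed above for a packet that never arrived


def parse_timing_line(line: str) -> List[Optional[float]]:
    """Parse one 'send,receive[,send2,receive2]' record.

    Times are seconds since the start of the test. The -1.0 sentinel is
    converted to None so callers cannot mistake it for a real time.
    """
    fields = [float(f) for f in line.strip().split(",")]
    if len(fields) not in (2, 4):
        raise ValueError("expected 2 or 4 fields, got %d" % len(fields))
    return [None if f == LOST else f for f in fields]


def round_trip(record: List[Optional[float]]) -> Optional[float]:
    """Round-trip time of the first exchange, or None if it was lost."""
    send, receive = record[0], record[1]
    if send is None or receive is None:
        return None
    return receive - send
```

For example, `round_trip(parse_timing_line("0.5,0.75"))` gives the delay
of the first exchange, and a record such as `1.0,-1.0,-1.0,-1.0` yields
None for it.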
I've included the version and other data in the output as the first few
lines; Tomek's point about missing information when reproducing a problem
is very pertinent.
Replying to [ticket:1263#comment:2 tomek]
> I have couple of comments.
Hmm... this is a definition of "couple" of which I was previously unaware
:-)
> In DHCPv6 we need 3 transmission modes:
That sounds reasonable. Re-reading the perfdhcp command line, the use of
the command-line argument is inconsistent: for IPv4, it is the address
''to'' which packets are sent, in IPv6 it is the interface ''from'' which
packets are sent.
I suggest that the interface be specified with the -l option. This
already sets the local hostname/address for an IPv4 packet exchange - it
could specify the local interface for an IPv6 exchange. The target to
which packets are sent should be the argument to the command line. To
simplify things for V6 use, as well as allowing an IPv6 address, the
program should also recognise the strings "all" (for
All_DHCP_Relay_Agents_and_Servers) and "servers" (for All_DHCP_Servers).
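A sketch of the target handling suggested above, in Python for
illustration only. The function name and alias table are hypothetical;
the two multicast addresses are the well-known values assigned to these
groups in RFC 3315.

```python
import ipaddress

# Well-known DHCPv6 multicast addresses (RFC 3315, section 5.1).
ALL_DHCP_RELAY_AGENTS_AND_SERVERS = "ff02::1:2"
ALL_DHCP_SERVERS = "ff05::1:3"

# Hypothetical alias table for the command-line argument.
V6_TARGET_ALIASES = {
    "all": ALL_DHCP_RELAY_AGENTS_AND_SERVERS,
    "servers": ALL_DHCP_SERVERS,
}


def resolve_v6_target(arg: str) -> ipaddress.IPv6Address:
    """Map "all"/"servers" to their multicast group, else parse arg as
    a literal IPv6 address. Raises ValueError for anything else."""
    return ipaddress.IPv6Address(V6_TARGET_ALIASES.get(arg, arg))
```

Anything that is neither an alias nor a valid IPv6 address is rejected
by the `ipaddress` module with a ValueError.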
> There is also rapid-commit option that, when supported by both server
> and client, will cause SOLICIT to be answered immediately with REPLY. That
> is not needed in first version, but it is something that we should plan to
> implement later.
Agreed. Ticket #1334 has been raised for it and put on the general
backlog.
> Regarding the -r option, it is useful, but it is not enough.
> :
> To meet those usages, -time (or -t) option should be added that
> specifies duration...
Agreed. Since r * t = n (rate * duration = number of requests), the
command parser should accept any two and calculate the third (objecting
if all three are given). I would suggest that the default for r be
something like 10/second; if neither t nor n is specified, assume a value
of n equal to 2^32^ - 1, i.e. essentially unlimited.
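The r * t = n rule could be handled along the following lines. This is
only a sketch in Python: the function name is hypothetical, and the
behaviour when just one value is supplied (fall back to the default rate)
is an assumption rather than something settled in this ticket.

```python
DEFAULT_RATE = 10        # exchanges per second, the default suggested above
UNLIMITED = 2**32 - 1    # "essentially unlimited" number of requests


def resolve_rate_time_count(rate=None, duration=None, count=None):
    """Return (rate, duration, count) satisfying rate * duration = count.

    Any two values may be given and the third is calculated. Giving all
    three is an error. Missing values fall back to DEFAULT_RATE and
    UNLIMITED (an assumption of this sketch).
    """
    if rate is not None and duration is not None and count is not None:
        raise ValueError("specify at most two of rate, duration and count")
    for value in (rate, duration, count):
        if value is not None and value <= 0:
            raise ValueError("rate, duration and count must be positive")

    if rate is None:
        if duration is not None and count is not None:
            rate = count / duration
        else:
            rate = DEFAULT_RATE
    if count is None:
        count = UNLIMITED if duration is None else int(round(rate * duration))
    if duration is None:
        duration = count / rate
    return rate, duration, count
```

With a rate of 100 and a duration of 5 seconds this yields a count of
500; with no arguments at all it yields the default rate and the
unlimited count.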
(As an aside, allowing very large values for n complicates the mapping of
packet ID to information about the exchange, as a simple pre-allocated
array cannot be used. However, some form of double-buffer - where a buffer
can be reused once a time equal to (time last packet using this buffer
was sent + packet drop time) has passed - should work.)
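One possible shape for such a double-buffer, sketched in Python. The
class and its methods are hypothetical, and the sketch assumes packet IDs
are allocated sequentially so that consecutive blocks of IDs alternate
between the two buffers.

```python
class DoubleBuffer:
    """Map packet IDs to send times using two fixed-size buffers.

    A buffer is handed to a new block of IDs only once
    (send time of its last packet + drop_time) has passed, i.e. when no
    reply to any packet it holds can still be counted.
    """

    def __init__(self, size, drop_time):
        self.size = size
        self.drop_time = drop_time
        self.slots = [[None] * size, [None] * size]
        self.block = [None, None]      # block of IDs each buffer holds
        self.last_sent = [None, None]  # send time of last packet stored

    def record_send(self, packet_id, now):
        block, slot = divmod(packet_id, self.size)
        which = block % 2
        if self.block[which] != block:
            last = self.last_sent[which]
            if last is not None and now < last + self.drop_time:
                raise RuntimeError("buffer still in use; size too small")
            self.slots[which] = [None] * self.size  # reuse the buffer
            self.block[which] = block
        self.slots[which][slot] = now
        self.last_sent[which] = now

    def lookup_send(self, packet_id):
        """Send time for packet_id, or None if it is no longer held."""
        block, slot = divmod(packet_id, self.size)
        which = block % 2
        if self.block[which] != block:
            return None
        return self.slots[which][slot]
```

The buffer size has to be at least (send rate * drop time) for the reuse
condition to hold at a steady rate; the RuntimeError flags the case where
it is too small.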
> Other things that we should consider at a later date is turning this
> into stress testing. Let's call it --torture or similar. It starts sending
> data at some rate and increases it slowly until server starts dropping.
> That is the maximum rate the server can handle.
Agreed. Ticket #1335 has been raised for it and put on the general
backlog.
> There should be option to conclude (fail) the test if there is a single
> drop. We don't want to wait 12 hours to see that 5 seconds after test
> started something broke. Not sure how to implement this in the most
> convenient way. Maybe --drop-threshold that specifies acceptable amount of
> dropped traffic? It seems useful to have it specified in both percentage
> and absolute numbers.
I suggest that it be specified simply as a number of packets for now,
with something like "-t<lost-packets>". If not specified, there is no
limit to the number of dropped packets.
> Besides using dhcperf as a manual tool, it will also be used as an
> automated test. In that case it should clearly state if the specified
> pass criteria are met or not. Something that could be easily parsed by
> automated environments.
>
> Make sure that return code will specify status.
This sounds useful, although I'm not clear what you mean here. In any
case, I think it is something that can be added later. Could you raise a
ticket for it?
For now, as suggested above, the return code should be 1 if any packets
were dropped.
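The exit status and the lost-packet limit could interact roughly as
follows. This is a sketch in Python with hypothetical function names; in
particular, treating the limit as "abort once more than this many packets
are lost" is an assumption, since the exact comparison is not specified
here.

```python
from typing import Optional


def exit_status(dropped: int) -> int:
    """Exit status as proposed: 1 if any packets were dropped, else 0."""
    return 1 if dropped > 0 else 0


def should_abort(dropped: int, max_drops: Optional[int] = None) -> bool:
    """True once the number of lost packets exceeds the limit.

    With no limit given (None) the test is never aborted early.
    """
    return max_drops is not None and dropped > max_drops
```

A test aborted by the limit would therefore always exit with status 1,
since at least one packet has been dropped by then.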
> For automated test tools it is very convenient to print out command-line
> parameters. That's a practical experience. I received many logs that were
> useless because it was not possible to reproduce the problem due to
> missing information about used parameters.
See above when the "-o" option is specified. Do you think there is a need
to echo them if used interactively, at a terminal?
> There is no --version parameter. Tool should also print out its version
> when started. See above comment about reproduction concerns.
Agreed. I would make the "-v" switch do this.
As to the verbose option, I suggest this be merged with the debug option.
In a small program such as this, you probably want debugging information
for different areas of the code rather than at different levels of
detail. If, instead of a "debug-level", the argument to the -x switch
were a "debug-mask", different pieces of debug information could be
output by setting bits in the mask value. Displaying the packet contents
and communication would then require using -x with a value that has the
appropriate bits set.
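To illustrate the debug-mask idea, a short Python sketch. The area names
and bit assignments are invented for the example; nothing in this ticket
fixes which bit means what.

```python
import enum


class Debug(enum.IntFlag):
    """Hypothetical debug areas, one bit each."""
    PACKETS = 0x01   # dump packet contents
    TIMING = 0x02    # report send/receive timing
    SOCKETS = 0x04   # report socket setup and communication


def parse_mask(text: str) -> int:
    """Accept the -x argument in decimal or with a 0x/0o/0b prefix."""
    return int(text, 0)


def debug_enabled(mask: int, area: Debug) -> bool:
    """True if the bit for this area is set in the mask."""
    return bool(mask & area)
```

With these assignments, "-x 5" (or "-x 0x5") would enable packet and
socket debugging but not timing output.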
> Another feature that could make this tool much more powerful is the
> ability to specify additional options. While it would be great to have
> custom option definition framework, for now we can do something much
> simpler. A command-line option that specifies extra payload that is
> appended to the message. A proper warning "it is user's responsibility to
> take care a proper format". For example, to specify that I want to send
> option type 100 with length 2 containing 0xabcd, I could do: --extra-data
> 00:64:00:02:ab:cd. This seems simple enough (parse command-line + a single
> memcpy will do the trick)
>
> Another useful thing would be to specify which options client should
> request. That is also not too difficult. This is just adding 8 bit(v4) or
> 16bit (v6) integers to PRL or ORO, respectively. Usage could be simple:
> --option 45 --option 5.
Added #1336 to the backlog.
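For reference when #1336 is picked up, the two suggestions quoted above
might look something like this. It is a Python sketch with hypothetical
function names. The option codes used for the request lists (55 for the
DHCPv4 Parameter Request List, 6 for the DHCPv6 Option Request Option)
are the standard values; everything else is illustrative.

```python
import struct
from typing import List


def parse_extra_data(text: str) -> bytes:
    """Turn "00:64:00:02:ab:cd" into the raw bytes to append.

    No check is made that the bytes form a valid option; as noted in
    the quoted comment, that is the user's responsibility.
    """
    parts = text.split(":")
    if any(len(p) != 2 for p in parts):
        raise ValueError("each element must be two hex digits")
    return bytes.fromhex("".join(parts))


def build_v4_prl(codes: List[int]) -> bytes:
    """DHCPv4 Parameter Request List: code 55, length, 8-bit codes."""
    return bytes([55, len(codes)]) + bytes(codes)


def build_v6_oro(codes: List[int]) -> bytes:
    """DHCPv6 Option Request Option: code 6, length, 16-bit codes."""
    header = struct.pack("!HH", 6, 2 * len(codes))
    return header + struct.pack("!%dH" % len(codes), *codes)
```

With the example from the quoted comment, the first four bytes of
`parse_extra_data("00:64:00:02:ab:cd")` decode (as two 16-bit network
order integers) to option type 100 and length 2, followed by the payload
0xabcd.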
> It would be useful to elaborate on reply verification. V4 server
> responding with NACK is ok or not? What about v6 server sending REPLY with
> status-code=no-addrs-avail? That is another thing we could eventually add
> as a feature. In some scenarios negative response is considered a proper
> one (test passed) and in others it is not (test failed). Make sure that
> the verification could be tuneable. For now it can be simple, but it will
> be more complex later.
Added #1337 to the backlog.
The options suggested for immediate inclusion do not seem too much work,
and tickets have been raised for the more complicated stuff. I suggest
the design document is updated with these suggestions then we close the
ticket and start work on implementation. Given our time constraints for
the work, we need to do this ASAP.
--
Ticket URL: <http://bind10.isc.org/ticket/1264#comment:7>
BIND 10 Development <http://bind10.isc.org>