[kea-dev] Client classification design

Thu Oct 22 10:30:37 UTC 2015

On 15.10.2015 19:13, Tomek Mrugalski wrote:
> Thanks a lot for all your comments. The documents are now updated. Here
> are the major changes:
> 
> - Former Requirements document (that was lacking proper requirements
>   definition) has been renamed to ClientClassificationDiscussion
> - New Requirements document written
> - Classes are now global only. Trying to define classes on per subnet
>   basis was a mistake. Classification is (and should be) done before
>   the subnet is selected.
> - substring operator is now MUST for phase 1
> - added explicit requirement for the evaluation code to be a
>   stand-alone library with minimal set of dependencies.
> - Added a number of new expressions for phase 2 (E.7 - E.13)
> - case 1 (substring(option vendor-class-identifier, 0, 3) = "APC") is
>   now phase 1
> - added a list of expected tickets
> - updated E.1 requirement to cover vendor-class-identifier(60)
> - clarified examples
> - Added G.8 (code must be located in separate lib)
> - Added G.9 (must use its own dedicated logger)
> - Added G.10 (number of classes must not be limited.
> 
> Two (related) things that remain unsolved are:
> - which configuration syntax to use (we have a tie, one vote for
> proposal 1 and one for proposal 2)
> - what order to use
> 
> Currently we do:
> 
> 1. classify packet
> 2. select subnet (we use class information here)
> 
> We need 2. to for cable networks (separate modems and other devices into
> distinct subnets). If we stick with it, we won't be able to define per
> subnet classes (or at least not use them to select subnets). This would
> also force us to use proposal 1.
> 
> The list of expected tickets just for phase 1 is whopping 16. Even
> ridiculously underestimating the tasks as 2 days/task (good luck with
> implementing Bison grammar in two days), this gives us 32 days (or more
> than 6 weeks) of engineering time. I leave it up to you to decide
> whether it's realistic to shove this into 1.0.
> 

Requirements

"Client classification can be a very complex feature .... " is repeated
twice in the document. I recommend that it is left in Intro and Scope
section only.

Also, this paragraph suggests that we're going to limit ourselves to
equality operator, while we actually think the substring operation will
also be required.

"Our current (already written) implementation... "
Suggest adding a timestamp because as the time goes by this will refer
to the past situation.

"Design assumptions"
This should be renamed to "Requirements List" or similar.

The NOTE should be removed, because these items have already been moved
to Requirements doc.

G.2. should be split to two requirements: one talking about the
comparison, and another one about the substring operation. As it stands
it is easy to miss the requirement for substring operation.

G.3 - It would be clearer to say "DHCP Options returned to the client
are the combination of global options, subnet specific options and those
specific to the classes that the client belongs to"

I also think for G.3 that it should say something about class-specific
options taking precedence over global and subnet specific options.

G.5 - This requirement is unclear as it stands. In particular
"expression" is ambiguous. If it was explained well enough, the second
part of this require wouldn't be needed. Maybe add something that
expression in general  consists of several conditions with bool logic
applied in-between. Or something like that.

G.7. - I don't understand why the requirement is repeated in parens and
what "can apply" means.

G.8 - It is going to be hard to measure "minimal dependencies". I
recommend you simply remove the part about minimal dependencies. I guess
nobody would add dependencies that are not required by the library
implementation.

G.10. - I suggest rewriting it to say that "The administrator must be
able to define arbitrary number of classes". I also suggest removing the
part about issuing warning etc. I don't think we should even consider
this warning at this point, because it is hard to determine upfront what
number of classes is already worth a warning. I'd leave it up to the
administrator of a system.

E.1, E.2. - I suggest that you expand on "extracting option values". Is
it to say: "Extracting full option payload as an input for expression"?
As opposed to extracting individual option fields?

E.3. - I think this requirement could be clearer as to how the
expression would look like for the option with multiple fields, would
this be comma separated format or something else. This may be considered
implementation detail really but I rise this question because you used
example of "MSFT" string which is normally used in the substring
operation context. Simply don't know if this requirement equally applies
to the case of equality operator.

E.5. I presume the equality operator is supported in the sense that you
can compare the payload of the whole option to the string or hexstring.
Correct?

I am confused with the distinction between the "General" and
"Functionality" requirements. Although, it is not a big deal because the
essence is in the requirements themselves but grouping into distinct
sets may make requirements clearer or may make them actually harder to read.

For example, aren't requirements G.3, G.5 and G.10 really functional
requirements here?

I don't see anything mentioned explicitly about sub-options here. Do we
need additional requirements to cover the classification based on
sub-options? This is especially interesting in the context of comparing
the whole option with suboptions against something for equality.

I see further on that you're thinking about using the output of toText()
function. But there are some issues with this output being compared to a
fixed string, which I mention below. There is also an issue that if
someone creates a configuration file which would use the equality
operator, the future change in the toString()/toText() implementation
would make his configuration incompatible.

A couple of comments about the Design document follow.

"Configuration"
There are three proposals but the design doesn't include the final
decision which of them will be implemented. I also think that the
examples should reflect the fact that we're planning to start with
implementation of global classes, rather than classes nested within the
subnet.

"Parsing"

This paragraph is not true anymore:
"Since the only type of expressions supported will be value1 == value2,
we could simply parse everything before == and treat it as one
expression, the parse the rest of the string after == as a second
expression."

I am not sure I understand that:
"This list is not complete. We will likely need a dedicated class for
extracting information from Vendor-Identifying Vendor Class (124) and
Vendor-Identifying Vendor-Specific Information (125) options."

What list does it refer to? Also, why would you need additional class
for those options? Aren't you planning to introduce a virtual toString()
function which will be implemented within existing class hierarchy?

"Interaction with Option classes"
I am trying to understand the code that will be introduced into Option
class and derived classes. So, you're planning to evaluate using the
string value carried within the option, e.g. "MSFT". In the requirements
you also mentioned that evaluation using the hex representation is
possible. So, technically you're going to implement the toString()
function within Option class, returning the string OR hex string? Would
this be controlled using some sort of parameter passed to toString()? I
presume it would be something like toString(const bool use_hex) ?

"Expected Tickets"

About task #4. After implementing this ticket all instances of the
classes derived from Option would return toText()? So this change would
simply be to implement the following function in the Option class?

std::string Option::toString() const {
    return (toText());
}

Note that some specialized options would return data in different
formats and they also include indentation and new line characters.
Comparing such output of toText() for equality wouldn't probably do any
good. It should work fine for substring, though.

Another related question. How this will satisfy the requirement to
compare option against hex string?

About #5. Are we actually going to use any Yacc or Bison for 1.0?

About #15. I don't have such a strong opinion about the update to
Developer's Guide. I agree we should aim for this but the world wouldn't
collapse if we don't address it for 1.0. The world will collapse if we
don't update the User Guide properly.

Marcin