[kea-dev] Some thoughts about client classification

Tue Sep 22 09:12:58 UTC 2015

On 21.09.2015 21:08, Stephen Morris wrote:
> At the call today, I said that I would kick off the discussion on client
> classification.  Regard this as something from which to generate ideas
> rather than a done and dusted proposal.  In particular, it is
> deliberately simple - the simpler it is, the earlier we are likely to
> implement it.
> 
> 
> Section 7.2.13 in the Kea manual:
> 
> http://kea.isc.org/docs/kea-guide.html#dhcp4-client-classifier
> 
> ... talks about the simple classification that we already have in the
> DHCPv4 and illustrates how to use it to limit access to an IPv4 subnet.
> 
> In essence, Kea looks for the vendor class identifier option and, if
> present, associates the packet with a class named VENDOR_CLASS_xxx,
> where "xxx" is the contents of the vendor class identifier option.
> 
> Once the class has been created, Kea can restrict the subnet from which
> the address is drawn by including a
> 
> "client-class": "VENDOR_CLASS_xxx"
> 
> ... statement in the "subnet" clause.
> 
> The two elements of that scheme are:
> 
> 1. Creation of a client class based on some criteria.
> 2. Use of the class to allow/restrict the action of Kea.
> 
> Any scheme we have will need to do both of these things, and a possible
> way to express this in a Kea configuration is described below.
> 
> 
> Creation of client class
> ---
> Client classes are defined in a clause within the "Dhcp4" or "Dhcp6"
> blocks, and contain the name of the class and a single "test" clause,
> that defines the criteria for inclusion in that class, e.g.
> 
> "client-classes": [
>     {
>         "name": "class-name",
>         "test": [ selector, op, value ]
>     },
>     {
>         "name": "class-name",
>         "test": [ selector, op, value ]
>     }...
> ]
> 
> The "test" component is an array giving the name of the selector in the
> packet, a comparison operation, and a comparison value.  To keep things
> simple, everything is done as string comparisons. This means that for a
> first pass:
> 
> * The values of integers are converted to base 10 string format before
> being checked.
> * IP addresses are converted to presentation format.
> * Boolean values are converted to "true" or "false".
> * Binary data is converted to a hexadecimal string.
> 
> "selector" is either the name of a field in the packet or the name of an
> option carried by the packet.  If an option is an array, then the
> elements of the array are concatenated together (possibly separated by a
> separator such as "/" or ";") before being used in the comparison. If a
> substring is required, that is included in the name of the selector by
> suffixing it with "[start-offset:end-offset]".
> 
> "op" is one of: "lt", "le", "eq", "ne", "ge", "gt" or "match".  The
> first six are self-explanatory and compare the string derived from the
> option with the value.  "match" checks if the option string matches the
> regular expression.
> 
> "value" is the value to compare against.
>

For comparisons involving a single option field compared against a
specific value (e.g. integer) I don't see why we couldn't try to convert
the value specified in the definition of the class to the type held in
the option. For example, if the specific option is an array of unsigned
integers, this option is represented by the OptionInt class. This class
holds the array of integers internally. This class should expose
comparison method(s) which would accept the index of the field within
the option to compare to, and the value defined in the client class as a
string. So for example:

bool OptionInt<uint32_t>::compare(unsigned int field_index,
                                  const std::string& op_type,
                                  const std::string& value);

This function would know which type the given field (specified by the
field_index) has and would lexical_cast value to this type for comparison.

For the match operation the comparison would probably be done on
strings, as you propose. For the other operations it makes more sense to
convert the value to the type held in the option, rather than the other
way around: if someone types "01" instead of "1", the test wouldn't pass
if compared on strings.

I also think that for Kea 1.0 we could restrict the number of
supported operations to "eq" and "match". We could also consider "ne".
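
A minimal sketch of the typed comparison, assuming a free function
instead of the OptionInt method and only the "eq" and "ne" operations.
The name compareField and its error handling are made up for
illustration and are not part of Kea; std::stoul stands in for
lexical_cast to keep the example self-contained:

```cpp
#include <cctype>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not Kea API): compares an unsigned 32-bit option
// field against the value string taken from a client class definition.
bool compareField(uint32_t field, const std::string& op_type,
                  const std::string& value) {
    // Reject empty input, signs and leading whitespace up front.
    if (value.empty() || !std::isdigit(static_cast<unsigned char>(value[0]))) {
        throw std::invalid_argument("not an unsigned integer: " + value);
    }
    std::size_t consumed = 0;
    // Convert the configured value to the field's type (base 10).
    const unsigned long parsed = std::stoul(value, &consumed, 10);
    if (consumed != value.size() ||
        parsed > std::numeric_limits<uint32_t>::max()) {
        throw std::invalid_argument("not a valid uint32 value: " + value);
    }
    const bool equal = (field == static_cast<uint32_t>(parsed));
    if (op_type == "eq") {
        return equal;
    }
    if (op_type == "ne") {
        return !equal;
    }
    throw std::invalid_argument("unsupported operator: " + op_type);
}
```

With this, compareField(1, "eq", "01") and compareField(1, "eq", "1")
both return true, whereas the strings "01" and "1" compare unequal.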

> Kea searches the "client-classes" array testing the packet against each
> element of the array in order.  The first clause that matches sets the
> class of the packet to the name given. If no clause matches, the packet
> is unclassified (i.e. its class name is the empty string "").
> 
> Examples:
> 
> "test": ["chaddr", "eq", "08002b02deadbeef"]
> ... matches if the hardware address (in the "chaddr" field) is that
> specified.
> 
> "test": ["chaddr[0:8]", "eq", "08002b02"]
> As above but matching all clients if their hardware address starts with
> the string specified.
> 

One has to note that chaddr is not an option but a field in the
packet, so it would require some special code paths. I think that in the
first step we don't need classification based on the contents of
chaddr, because this is what host reservation is intended to do, with
its own semantics.

> "test": ["vendor-class-identifier", "match", "foo"]
> Matches if the vendor class identifier option contains the string "foo"
> somewhere in it.
> 
> "test": ["vendor-class-identifier", "match", "bar$"]
> Matches if the vendor class identifier ends with the letters "bar".
> 
> 

I think we also need an example of comparing a specific option field,
which seems to be quite a frequent use case. So, rather than matching
the whole option, we may sometimes do:

"test": ["vendor-class-identifier[2]", "eq", "docsis3.0"]

where [2] is the index of the option field. Note that we don't track
the names of the fields.
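
One possible way to split such a selector into the option name and the
field index, as a sketch only. Selector and parseSelector are
hypothetical names, and the "[start-offset:end-offset]" substring form
from the proposal is deliberately rejected here, since both notations
use square brackets and would need to be told apart:

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical result type: option name plus an optional field index.
struct Selector {
    std::string name;
    bool has_index;
    unsigned int field_index;
};

// Parses "name" or "name[N]", where N is a decimal field index.
Selector parseSelector(const std::string& text) {
    Selector result;
    result.has_index = false;
    result.field_index = 0;

    const std::string::size_type open = text.find('[');
    if (open == std::string::npos) {
        result.name = text;  // no index: the whole option is selected
        return result;
    }
    if (open == 0 || text[text.size() - 1] != ']') {
        throw std::invalid_argument("malformed selector: " + text);
    }
    // Characters between '[' and the closing ']'.
    const std::string digits = text.substr(open + 1, text.size() - open - 2);
    if (digits.empty()) {
        throw std::invalid_argument("empty field index: " + text);
    }
    for (std::string::size_type i = 0; i < digits.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(digits[i]))) {
            throw std::invalid_argument("bad field index: " + text);
        }
    }
    result.name = text.substr(0, open);
    result.has_index = true;
    result.field_index = static_cast<unsigned int>(std::stoul(digits));
    return result;
}
```

For "vendor-class-identifier[2]" this yields the name
"vendor-class-identifier" and field index 2.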

One also has to note that options can have suboptions. These suboptions
have to be referenced somehow. Maybe we don't need to do it for 1.0, but
some form of encapsulation notation would be needed.

> 
> Use of class information
> ---
> The current classification scheme requires that for a configuration
> block to be used, it must either contain no class name or, if it does,
> the name must match the class associated with the packet.
> 
> The most frequent use of client classification would be to return a
> specific address or set of options to the client depending on its class.
> 
> The Kea manual already illustrates how a specific address can be
> returned, by incorporating the "client-class" keyword in a "subnet"
> clause (section 7.2.13.1). As a subnet within a subnet{4,6} clause can
> already contain options that override the global data, maybe the
> quickest way to allow the choice of options based on class is to permit
> overlapping subnets and pools within the subnet clauses.  This is
> illustrated in the fuller example, below:
> 
> 
> "Dhcp4": {
>     "client-classes": [
>         {
>             "name": "foo",
>             "test": ["chaddr[0:8]", "eq", "08002b02"]
>         },
>         {
> # If we allow a class name to appear multiple times, we
> # can implement a simple "OR" operation with no extra
> # effort.
>             "name": "foo",
>             "test": ["chaddr[0:8]", "eq", "00de3322"]
>         },
>         {
>             "name": "bar",
>             "test": ["vendor-class-identifier", "match", "docsis"]
>         }
>     ],
>     "subnet4": [
>         {
>             "subnet": "192.0.2.0/24",
>             "pools": [{ "pool": "192.0.2.10 - 192.0.2.20" }],
>             "client-class": "foo",
>             "option-data": [
>                 {
>                     "name": "domain-name-servers",
>                     "code": 6,
>                     "space": "dhcp4",
>                     "csv-format": true,
>                     "data": "192.0.2.3"
>                 }
>             ]
>         },
>         {
> # Same subnet and pool as above, but with a different value for
> # the option.
>             "subnet": "192.0.2.0/24",
>             "pools": [{ "pool": "192.0.2.10 - 192.0.2.20" }],
>             "client-class": "bar",
>             "option-data": [
>                 {
>                     "name": "domain-name-servers",
>                     "code": 6,
>                     "space": "dhcp4",
>                     "csv-format": true,
>                     "data": "10.0.0.13"
>                 }
>             ]
>         },
>         {
> # No client-class clause, so this will be used by all clients
> # that failed to fall into one of the defined classes.
>             "subnet": "192.1.2.0/24",
>             "pools": [{ "pool": "192.1.2.1 - 192.1.2.254" }],
>             "option-data": [
>                 {
>                     "name": "domain-name-servers",
>                     "code": 6,
>                     "space": "dhcp4",
>                     "csv-format": true,
>                     "data": "10.0.2.1"
>                 }
>             ]
>         }
>     ]
> }       ...
> 
> 
> Not elegant, true, but it seems to make maximum use of what we already have.
> 

I think it is hackish and not acceptable in the long run. It would put a
serious burden on the administrator, who would have to copy and paste
all these subnets to define new classes, and who knows how many classes
there will be. It would also require changing how we parse the subnet
configuration: each subnet instance comes with its own, usually
auto-generated, subnet id, so the copies would actually get different
ids.

I think the proper way to do it is, unfortunately, to add a "class" map
to the subnet structure and, for each class, hold the options specific
to that class.
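
Roughly, and only as an illustration of the idea, the lookup could work
like this. SubnetConfig, OptionSet and lookupOption are hypothetical
names (the real subnet classes in Kea look different), and options are
kept as plain strings to keep the sketch short:

```cpp
#include <map>
#include <string>

// Option name -> option value in text form (simplification).
typedef std::map<std::string, std::string> OptionSet;

// Hypothetical subnet structure: one subnet, one subnet id, with
// per-class option overrides instead of duplicated subnet entries.
struct SubnetConfig {
    std::string prefix;
    OptionSet options;                               // subnet-wide values
    std::map<std::string, OptionSet> class_options;  // class name -> overrides
};

// Returns the class-specific value if the client's class defines the
// option, otherwise the subnet-wide value, otherwise an empty string.
std::string lookupOption(const SubnetConfig& subnet,
                         const std::string& client_class,
                         const std::string& option_name) {
    auto cls = subnet.class_options.find(client_class);
    if (cls != subnet.class_options.end()) {
        auto opt = cls->second.find(option_name);
        if (opt != cls->second.end()) {
            return opt->second;
        }
    }
    auto opt = subnet.options.find(option_name);
    return (opt != subnet.options.end()) ? opt->second : std::string();
}
```

The subnet 192.0.2.0/24 from the example would then appear once, with
"foo" and "bar" entries in the map carrying their own
domain-name-servers values.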

Marcin