[kea-dev] Some thoughts about client classification

Tue Sep 22 12:25:53 UTC 2015

On 22/09/15 10:12, Marcin Siodelski wrote:
> On 21.09.2015 21:08, Stephen Morris wrote:

> For comparisons involving a single option field compared against a
> specific value (e.g. integer) I don't see why we couldn't try to convert
> the value specified in the definition of the class to the type held in
> the option. For example, if the specific option is an array of unsigned
> integers, this option is represented by the OptionInt class. This class
> holds the array of integers internally. This class should expose
> comparison method(s) which would accept the index of the field within
> the option to compare to, and the value defined in the client class as a
> string. So for example:
> 
> bool OptionInt<uint32_t>::compare(unsigned int field_index,
>     const std::string& op_type, const std::string& value);
> 
> This function would know which type the given field (specified by the
> field_index) has and would lexical_cast value to this type for comparison.
> 
> For the match operation it would probably do it on strings as you
> propose, but for others it makes more sense to convert value to the
> specific type, rather than the other way around because if someone types
> "01" instead of "1" it wouldn't pass the test if compared on strings.

If that is easy to do, I'm OK with that.  I only suggested strings in an
attempt to simplify implementation.
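
To make the "01" versus "1" point concrete, here is a minimal sketch of a
typed comparison.  It is only an illustration: compareField() is a
hypothetical free function rather than the proposed OptionInt method, it
handles a single uint32_t field, and it uses std::stoul in place of
lexical_cast so that the example has no Boost dependency.

```cpp
#include <cctype>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: compare one uint32_t option field against the
// string taken from the class definition, converting the string to the
// field's type instead of converting the field to a string.
bool compareField(std::uint32_t field, const std::string& op_type,
                  const std::string& value) {
    // Reject empty input, signs and leading whitespace, all of which
    // std::stoul would otherwise accept.
    if (value.empty() ||
        !std::isdigit(static_cast<unsigned char>(value[0]))) {
        throw std::invalid_argument("not an unsigned integer: " + value);
    }
    std::size_t used = 0;
    const unsigned long parsed = std::stoul(value, &used, 10);
    if (used != value.size() ||
        parsed > std::numeric_limits<std::uint32_t>::max()) {
        throw std::invalid_argument("not a valid uint32 value: " + value);
    }
    const std::uint32_t typed = static_cast<std::uint32_t>(parsed);
    if (op_type == "eq") {
        return field == typed;
    }
    if (op_type == "ne") {
        return field != typed;
    }
    throw std::invalid_argument("unsupported operator: " + op_type);
}
```

With this approach compareField(1, "eq", "01") succeeds, whereas a string
comparison of "1" against "01" would fail.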

> I'd also think that for Kea 1.0 we could restrict the number of
> supported operations to "eq" and "match". We could also consider "ne".

That would be OK (although "eq" is not strictly necessary, as "match" to
"^string$" is the equivalent).  It does strike me that we would also
want a "not-match" operator.
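
As a rough sketch of how the four operators could relate to each other,
the following uses std::regex from the standard library.  The evaluate()
function and its signature are assumptions made for illustration only;
nothing here is existing Kea code.

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical evaluator for the string operators discussed above:
// "eq", "ne", "match" and "not-match".
bool evaluate(const std::string& actual, const std::string& op,
              const std::string& value) {
    if (op == "eq") {
        return actual == value;
    }
    if (op == "ne") {
        return actual != value;
    }
    if (op == "match" || op == "not-match") {
        // regex_search succeeds if the pattern matches anywhere in the
        // string, so anchors such as ^ and $ must be given explicitly.
        const bool found = std::regex_search(actual, std::regex(value));
        return (op == "match") ? found : !found;
    }
    throw std::invalid_argument("unsupported operator: " + op);
}
```

One caveat: "eq" and an anchored "match" are only interchangeable when the
value contains no regular expression metacharacters.  In a value such as
"docsis3.0", the "." would match any character unless it is escaped.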

>> "test": ["chaddr", "eq", "08002b02deadbeef"]
>> ... matches if the hardware address (in the "chaddr" field) is that
>> specified.
>>
>> "test": ["chaddr[0:8]", "eq", "08002b02"]
>> As above but matching all clients if their hardware address starts with
>> the string specified.
>>
> 
> One has to note that chaddr is not an option but the field in the
> packet, so it would require some special code paths. I think that in the
> first step we don't require classification based on the contents of the
> chaddr because this is what host reservation is intended to do, with its
> own semantics.

My thought here was that for simplicity for the user, the fields in the
packet are accessed in the same way as options.  But you are right, the
parsing has to identify that the name is a field in the packet and take
a special code path.  We could perhaps leave this out for the first release.

> 
>> "test": ["vendor-class-identifier", "match", "foo"]
>> Matches if the vendor class identifier option contains the string "foo"
>> somewhere in it.
>>
>> "test": ["vendor-class-identifier", "match", "bar$"]
>> Matches if the vendor class identifier ends with the letters "bar".
>>
>>
> 
> I think we also need some example for comparison of some specific option
> fields, which seems to be quite a frequent use case. So, rather than
> matching the whole option we may sometimes do:
> 
> "test": ["vendor-class-identifier[2]", "eq", "docsis3.0"]
> 
> where [2] is an index of the option field. Note that we don't track
> names of the fields.
> 
> One also has to note that the options have suboptions. These options
> have to be referenced somehow. Maybe we don't need to do it for 1.0 but
> some form of encapsulation notation would be needed.

Again I was trying to keep things simple.  But if we can manage "[n]" to
indicate an element of an array as well as "[n:m]" to indicate a
substring, that would be good.  But we are likely to have to deal with
constructs such as:

   option[n].suboption[i:j]

... which will complicate parsing.
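
To give a feel for the parsing involved, here is a minimal sketch that
splits a selector on "." and recognises "name", "name[n]" and "name[n:m]"
in each part.  The SelectorPart structure, the parseSelector() function
and the set of characters allowed in a name are all assumptions made for
this example.  The sketch only records the indexes; it does not decide
whether they refer to array elements or to a substring.

```cpp
#include <regex>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One dotted component of a selector, e.g. "suboption[1:4]".
struct SelectorPart {
    std::string name;
    int first;   // n in [n] or [n:m], -1 if absent
    int last;    // m in [n:m], -1 if absent
};

// Hypothetical parser for selectors such as "option[2].suboption[1:4]".
std::vector<SelectorPart> parseSelector(const std::string& selector) {
    static const std::regex part_re(
        R"(([A-Za-z][A-Za-z0-9-]*)(?:\[(\d+)(?::(\d+))?\])?)");
    if (selector.empty() || selector.back() == '.') {
        throw std::invalid_argument("bad selector: " + selector);
    }
    std::vector<SelectorPart> parts;
    std::istringstream in(selector);
    std::string token;
    while (std::getline(in, token, '.')) {
        std::smatch m;
        // The whole token must match, so "a[" or "[1]" are rejected.
        if (!std::regex_match(token, m, part_re)) {
            throw std::invalid_argument("bad selector part: " + token);
        }
        SelectorPart part;
        part.name = m[1].str();
        part.first = m[2].matched ? std::stoi(m[2].str()) : -1;
        part.last = m[3].matched ? std::stoi(m[3].str()) : -1;
        parts.push_back(part);
    }
    return parts;
}
```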

With regular expression matching we might be able to get away without
needing to allow a substring specification in the first implementation, as

   "option[3:5]", "eq", "abc"

... could also be written as

   "option", "match", "^...abc"

>> The Kea manual already illustrates how a specific address can be
>> returned, by incorporating the "client-class" keyword in a "subnet"
>> clause (section 7.2.13.1). As a subnet within a subnet{4,6} clause can
>> already contain options that override the global data, maybe the
>> quickest way to allow the choice of options based on class is to permit
>> overlapping subnets and pools within the subnet clauses.  This is
>> illustrated in the fuller example, below:
>>
>> (example omitted)
> 
> I think it is hackish and not acceptable in the long run. It will put a
> serious burden on the administrator to copy and paste all these subnets
> to define new classes. And, who knows how many classes there will be.
> Also, it would require modification to how we parse subnet
> configuration because each subnet instance comes with its own, usually
> self-generated subnet id, so they would actually get different ids.
> 
> I think the proper way to do it is to, unfortunately, add a "class" map
> into the subnet structure and for each of them hold options specific to
> this class.

You are probably right, I was focusing on getting something running, and
this is probably not good in the long-term.

If we are doing this though, we have to do it properly, which implies
that we need a class map at the global level to override global options
and a class map at the subnet level to override subnet options.

Adding a couple of ideas that occurred to me since the last email, how
about:

Client class definition
---

"client-classes": [
    {
        "name": "class-name",
        "combine": "and",
        "tests": [
            [ selector, op, value ],
            [ selector, op, value ], ...
        ]
    },
    {
        "name": "class-name-2",
        "combine": "or",
        "tests": [
            [ selector, op, value ],
            [ selector, op, value ], ...
        ]
    }
]

The modification here is that each class definition has one or more
tests and there is an extra field that tells how the tests are combined
for the classification to succeed.  If omitted, the combination defaults
to "and".
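
A minimal sketch of how the "combine" field might be evaluated follows.
The ClientClass structure and classMatches() function are hypothetical,
and each test is modelled as a predicate that has already been bound to
the packet being classified.

```cpp
#include <algorithm>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// A test bound to the packet under examination (assumed representation).
using Test = std::function<bool()>;

struct ClientClass {
    std::string name;
    std::string combine;        // empty means omitted, treated as "and"
    std::vector<Test> tests;
};

// Returns true if the packet belongs to the class.
bool classMatches(const ClientClass& cls) {
    const auto run = [](const Test& t) { return t(); };
    if (cls.combine.empty() || cls.combine == "and") {
        // Every test must succeed.
        return std::all_of(cls.tests.begin(), cls.tests.end(), run);
    }
    if (cls.combine == "or") {
        // At least one test must succeed.
        return std::any_of(cls.tests.begin(), cls.tests.end(), run);
    }
    throw std::invalid_argument("unknown combine value: " + cls.combine);
}
```

Note that std::all_of returns true for an empty list, so a real parser
would need to enforce the "one or more tests" rule itself.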

Note that the "or" could also be implemented by repeating the
classification element, i.e.

"client-classes": [
   {
      "name": "foo",
      "combine": "or",
      "tests": [
         [ selector1, op1, value1 ],
         [ selector2, op2, value2 ], ...
      ]
   }
]

... is equivalent to

"client-classes": [
    {
        "name": "foo",
        "tests": [
            [ selector1, op1, value1 ]
        ]
    },
    {
        "name": "foo",
        "tests": [
            [ selector2, op2, value2 ]
        ]
    }
]

I think it would be more effort to prohibit that alternative
representation than to allow it.  In fact, allowing multiple definitions
of a class also allows the creation of more complicated logical
expressions than simple AND and OR.
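
To illustrate the last point, here is a sketch in which a client belongs
to a class if any definition carrying that name succeeds, with the tests
inside each definition ANDed together.  That allows an expression such as
(a AND b) OR (c AND d).  The Definition structure and inClass() function
are assumptions made for this example.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// A test bound to the packet under examination (assumed representation).
using Test = std::function<bool()>;

// One "client-classes" entry; its tests use the default "and" combination.
struct Definition {
    std::string name;
    std::vector<Test> tests;
};

// A client is in class `name` if at least one definition with that name
// has all of its tests succeed.
bool inClass(const std::vector<Definition>& defs, const std::string& name) {
    return std::any_of(defs.begin(), defs.end(), [&](const Definition& d) {
        return d.name == name &&
               std::all_of(d.tests.begin(), d.tests.end(),
                           [](const Test& t) { return t(); });
    });
}
```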

Client class use
---
The tricky thing is to add a syntax that is compatible with what we have
at the moment.  Currently, options are defined in the "option-data"
array, either at the server level or at the subnet level.

I suggest adding a "class-option-data" array of the form:

"class-option-data": [
    {
        "client-class": "class-name",
        "option-data": [ ... ]
    }
]

... which associates an option-data definition with a class.  This can
be added to both the top-level and the subnet declarations to provide
class-specific option data.

The search path for an option value is then:

1. Subnet class-option-data matching the client class
2. Subnet option-data
3. Global class-option-data matching the client class
4. Global option-data
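
The search path above can be sketched as a lookup that tries each of the
four locations in turn and stops at the first hit.  The type names and
the lookupOption() function are hypothetical, and option data is reduced
to a map from option name to a string value.

```cpp
#include <map>
#include <string>

using OptionData = std::map<std::string, std::string>;      // name -> data
using ClassOptionData = std::map<std::string, OptionData>;  // class -> options

// Either the global scope or a single subnet.
struct Scope {
    OptionData option_data;
    ClassOptionData class_option_data;
};

// Returns a pointer to the option's value, or nullptr if it is absent.
const std::string* findOption(const OptionData& data,
                              const std::string& option) {
    const auto it = data.find(option);
    return it == data.end() ? nullptr : &it->second;
}

const std::string* findClassOption(const Scope& scope,
                                   const std::string& client_class,
                                   const std::string& option) {
    const auto it = scope.class_option_data.find(client_class);
    return it == scope.class_option_data.end()
               ? nullptr
               : findOption(it->second, option);
}

// Search order: subnet class options, subnet options, global class
// options, global options.
const std::string* lookupOption(const Scope& global, const Scope& subnet,
                                const std::string& client_class,
                                const std::string& option) {
    if (const std::string* v = findClassOption(subnet, client_class, option)) {
        return v;
    }
    if (const std::string* v = findOption(subnet.option_data, option)) {
        return v;
    }
    if (const std::string* v = findClassOption(global, client_class, option)) {
        return v;
    }
    return findOption(global.option_data, option);
}
```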

As an example, consider the following configuration (apologies for the
length).  It may be easier to read the notes at the bottom and then look
at the relevant parts of the configuration.

"Dhcp4": {

# (Class definitions have been omitted from this example.)

    "option-data": [
        {
            "name": "domain-name-servers",
            "data": "192.0.2.3"
        }
    ],
    "class-option-data": [
        {
            "client-class": "beta",
            "option-data": [
                {
                    "name": "domain-name-servers",
                    "data": "10.0.0.1"
                }
            ]
        },
        {
            "client-class": "gamma",
            "option-data": [
                {
                    "name": "time-servers",
                    "data": "10.2.3.4"
                }
            ]
        }
    ],
    "subnet4": [
        {
            "subnet": "192.0.2.0/24",
            "client-class": "alpha"
        },
        {
            "subnet": "192.0.3.0/24",
            "client-class": "beta"
        },
        {
            "subnet": "192.0.4.0/24",
            "client-class": "gamma"
        },
        {
            "subnet": "192.0.5.0/24",
            "client-class": "delta",
            "option-data": [
                {
                    "name": "domain-name-servers",
                    "data": "10.0.0.2"
                }
            ]
        },
        {
            "subnet": "192.0.6.0/24",
            "option-data": [
                {
                    "name": "domain-name-servers",
                    "data": "10.0.0.3"
                }
            ],
            "class-option-data": [
                {
                    "client-class": "epsilon",
                    "option-data": [
                        {
                            "name": "domain-name-servers",
                            "data": "10.0.0.4"
                        }
                    ]
                }
            ]
        }
    ]
}

(Extraneous option definition information omitted.)

The logic for picking up the options is that if a client is classified as:

* "alpha": The subnets are searched sequentially, and this class matches
the class restriction of the first subnet 192.0.2.0/24.  As there is no
option definition in the "subnet" clause, Kea checks the global options.
The only global class option definitions are for the classes "beta" and
"gamma", so it picks up the generic global option definitions, giving the
DNS server address as 192.0.2.3.

* "beta": The second subnet (192.0.3.0/24) is selected with the class
matching.  Again there are no option definitions in the subnet, so the
global options are searched.  There are global class option definitions
for "beta", so those options are picked up, giving the DNS server as
10.0.0.1.  The generic global options are checked but as the only option
defined is for domain-name-servers - which Kea has already found - that
definition is ignored.

* "gamma": The subnet 192.0.4.0/24 is selected.  Like "beta" there is a
global class option definition for this class, so "time-servers" is set
to 10.2.3.4.  Unlike "beta" though, when the global option-data is
examined, Kea finds domain-name-servers defined.  This is an option that
has not already been found, so in addition to "time-servers", the option
"domain-name-servers" is picked up with a value of 192.0.2.3.

* "delta": domain-name-servers is defined in the matching subnet
definition, so the value of 10.0.0.2 is used.  No other options are
defined in the search path, so that is the only option picked up.

* "epsilon": Kea will settle on the subnet 192.0.6.0/24 as all the other
subnets are restricted to other classes.  Within this subnet, there is a
class option definition for domain-name-servers matching the class
"epsilon" so the defined value 10.0.0.4 is used.  There are no other
options in the search path, so this is the only option returned.

* Other classes: again the subnet 192.0.6.0/24 is used as that is the
only one that matches.  The class-option-data clause in that subnet
definition only matches "epsilon", so other classes will use the
"option-data" clause and the value of domain-name-servers as 10.0.0.3.
Again, as there are no other options in the search path, this is the only
option used.
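
The walkthrough above can be checked with a small sketch that encodes the
example configuration and collects every option along the search path,
keeping the first value found for each option name.  All of the type and
function names are hypothetical.  Subnet selection is reduced to "first
subnet whose class restriction is absent or equal to the client's class",
which ignores where the client is actually attached.

```cpp
#include <map>
#include <string>
#include <vector>

using OptionData = std::map<std::string, std::string>;      // name -> data
using ClassOptionData = std::map<std::string, OptionData>;  // class -> options

struct Subnet {
    std::string prefix;
    std::string client_class;   // empty: not restricted to a class
    OptionData option_data;
    ClassOptionData class_option_data;
};

struct Config {
    OptionData option_data;
    ClassOptionData class_option_data;
    std::vector<Subnet> subnets;
};

// The example configuration, reduced to the parts that matter here.
Config exampleConfig() {
    Config c;
    c.option_data = {{"domain-name-servers", "192.0.2.3"}};
    c.class_option_data = {
        {"beta", {{"domain-name-servers", "10.0.0.1"}}},
        {"gamma", {{"time-servers", "10.2.3.4"}}}};
    c.subnets = {
        Subnet{"192.0.2.0/24", "alpha", {}, {}},
        Subnet{"192.0.3.0/24", "beta", {}, {}},
        Subnet{"192.0.4.0/24", "gamma", {}, {}},
        Subnet{"192.0.5.0/24", "delta",
               {{"domain-name-servers", "10.0.0.2"}}, {}},
        Subnet{"192.0.6.0/24", "",
               {{"domain-name-servers", "10.0.0.3"}},
               {{"epsilon", {{"domain-name-servers", "10.0.0.4"}}}}}};
    return c;
}

// First subnet whose class restriction is absent or equals the class.
const Subnet* selectSubnet(const Config& c, const std::string& cls) {
    for (const Subnet& s : c.subnets) {
        if (s.client_class.empty() || s.client_class == cls) {
            return &s;
        }
    }
    return nullptr;
}

// Adds the options from `layer` that have not been found already;
// std::map::insert never overwrites an existing key.
void mergeLayer(OptionData& found, const OptionData& layer) {
    found.insert(layer.begin(), layer.end());
}

void mergeClassLayer(OptionData& found, const ClassOptionData& layers,
                     const std::string& cls) {
    const auto it = layers.find(cls);
    if (it != layers.end()) {
        mergeLayer(found, it->second);
    }
}

// Collects options in search-path order, most specific first.
OptionData optionsFor(const Config& c, const std::string& cls) {
    OptionData found;
    if (const Subnet* s = selectSubnet(c, cls)) {
        mergeClassLayer(found, s->class_option_data, cls);
        mergeLayer(found, s->option_data);
    }
    mergeClassLayer(found, c.class_option_data, cls);
    mergeLayer(found, c.option_data);
    return found;
}
```

For instance, optionsFor(exampleConfig(), "gamma") yields two options,
"time-servers" from the global class data and "domain-name-servers" from
the generic global data.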

Stephen