DHCP Hackathon summary

Mon Sep 22 13:50:58 UTC 2014

Here's the DHCP hackathon summary. These were the discussions that I
participated in. A couple of times several things were happening in
parallel, so I'll let others fill in the blanks.

We had a very interesting DHCP hackathon last week, co-organized by ISC
and the Department of Computer Communications of the ETI faculty of
Gdansk University of Technology. There were 14 participants with varied
backgrounds: undergrad students, lecturers, professors, network
administrators and software developers. Two participants had to drive
from almost the other end of the country to get to Gdansk. Thanks for
your dedication. We were even visited by a vice-dean of the faculty!

This was a two-day event where DHCP developers, users, students and
other interested people had a chance to talk, discuss bugs, fix them,
write new code and design new features together.

The hackathon started with slightly over two hours of participant
introductions, a DHCP introduction (about the protocol, but also its
implementations Kea and Dibbler) and an overview of ISC engineering
processes, including trac and the github repository.

Several documentation issues were reported and fixed (thanks to Adam
Osuchowski and Piotr Strzyzewski for reports and fixes). A couple of
logging issues were reported. Fortunately, ISC was aware of them and in
two cases we already had branches with proposed fixes that are currently
undergoing review. It's always useful to get hands-on confirmation of
whether a given solution is working. Especially when it doesn't :)
Fortunately, the code was promptly fixed.

We started using the Kea repository on github and managed to merge our
first pull request! Yay!

We had a discussion about implementing a minimalistic server and client
for embedded environments, based on libdhcp++ from Kea. Dominik Zeromski
and Tomek Kleinschmidt reported that they managed to trim libdhcp++ down
to 400kb, and that with additional steps it is possible to trim it
further to 200kb. One issue was the dependency on the libutil library.
That dependency is rather minor: only the OutputBuffer class is really
needed. We discussed various possible solutions: moving OutputBuffer to
libdhcp++, splitting OutputBuffer into its own small library etc.

Another roadblock on making libdhcp++ smaller is the exception handling
code. Exceptions cause the compiler to generate a non-trivial amount of
additional binary code. We had a discussion about it and the most
feasible way forward would be to split the isc_throw macro into
isc_throw and isc_throw_fatal. When compiling for an embedded
environment, the latter would be replaced by calling assert and
terminating the application, and the former would be silently ignored.
There will be an additional target for that (libdhcp++-minimal?).
Obviously, the regular target will remain and exceptions will be used in
normal environments (both isc_throw and isc_throw_fatal will simply
throw exceptions).

The second day was focused on host reservation design. This is an
upcoming feature for 0.9.1. After a lengthy discussion, we came up with
the following design:

1. Host reservations will be defined in the config file (the default
mode), but it will also be possible to store them in a database.

2. Proposed table layout is as follows:

Hosts table
- identifier (varchar) - container for client identifier. For now, it
  will be either MAC or DUID, but soon we will be adding additional
  identifiers, like subscriber-id, remote-id etc.
- identifier type - specifies type used in identifier (this will
  essentially be an enum with values of TYPE_MAC, TYPE_DUID etc.)
- ip4 - an IPv4 address (this field may be null)
- ip6 - list of IPv6 addresses (may be null, one or several addresses)
- ip6prefix - list of IPv6 prefixes (null, one or several prefixes)
- hostname (optional)
- clientclass (this field will be added at a later stage)
- host-id (a primary key)

We need a unified way to reserve hosts by MAC, DUID, subscriber-id,
remote-id, docsis options etc. The design above allows that. It is also
easily extensible. If we decide to add support for a new identifier,
it's only a matter of adding one more enum value, database-wise. It will
be more involved DHCP engine-wise, but that complexity is an independent
problem.

The ip4 address is optional, as it may be an IPv6-only reservation or
maybe a mac+hostname reservation.

The ip6 and ip6prefix fields may have zero (IPv4-only reservation or
maybe mac+hostname only), one (typical reservation) or more than one
address or prefix. The DHCPv6 spec allows a single client to handle more
than one address and more than one prefix (the client sends several
instances of IA_NA and/or IA_PD options).

The hostname is obviously optional.

Clientclass is an optional field. It allows specifying that a given host
belongs to a client class. That is important for at least two reasons.
First, we may get per-host client classification granularity. Second, if
there are 100 hosts that require an option, we may want to add one
option and say that it's for class X, rather than adding 100 instances
of exactly the same option. It seems that client class will be
implemented at a later stage (or perhaps implemented along with the rest
of the fields, but ignored for now).

host-id will be required to be able to refer to a specific host
instance. It is not clear yet if we will need it for the memfile
backend.
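
As a rough illustration of the hosts table described above, here is a
minimal sketch using Python's built-in sqlite3 module instead of MySQL
or Postgres. The column names, the numeric values of the identifier
types, and the choice to store the IPv6 lists as comma-separated text
are all assumptions made for the example, not part of the agreed design.

```python
import sqlite3

# Hypothetical numeric values for the identifier type enum.
TYPE_MAC = 0
TYPE_DUID = 1

SCHEMA = """
CREATE TABLE hosts (
    host_id         INTEGER PRIMARY KEY,   -- host-id
    identifier      VARCHAR(128) NOT NULL, -- MAC, DUID, ... as text
    identifier_type INTEGER NOT NULL,      -- TYPE_MAC, TYPE_DUID, ...
    ip4             TEXT,                  -- may be NULL
    ip6             TEXT,                  -- NULL, one or several addresses
    ip6prefix       TEXT,                  -- NULL, one or several prefixes
    hostname        TEXT,                  -- optional
    clientclass     TEXT                   -- planned for a later stage
);
"""

def open_hosts_db():
    """Create an in-memory database holding an empty hosts table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def add_host(conn, identifier, identifier_type, ip4=None, ip6=(),
             ip6prefix=(), hostname=None):
    """Insert one reservation and return its host_id."""
    cur = conn.execute(
        "INSERT INTO hosts (identifier, identifier_type, ip4, ip6,"
        " ip6prefix, hostname) VALUES (?, ?, ?, ?, ?, ?)",
        (identifier, identifier_type, ip4,
         ",".join(ip6) or None,          # empty list is stored as NULL
         ",".join(ip6prefix) or None,
         hostname),
    )
    return cur.lastrowid
```

A separate table for addresses and prefixes would be the more
conventional way to hold the lists; the text columns here only keep the
sketch close to the single-table layout listed above.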

3. Multiple host reservations
Since we'll be able to reserve a host based on MAC, DUID and other
parameters, we can't enforce them to be unique. Therefore it may be
possible that due to misconfiguration there will be several reservations
that match. It would be bad if the server randomly used one or the
other. We need to handle 3 cases here: 0 hosts (carry on as usual), 1
host (use that reservation), more than one (log error).
The easiest way would be to do a SELECT with LIMIT 2. We're not really
interested in how many there are, just whether there is more than one.
If so, we will log a warning and probably use whatever MySQL or
Postgres returned as the first host.
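
The LIMIT 2 idea can be sketched as follows. This is only an
illustration: it uses SQLite from Python's standard library as a
stand-in for MySQL or Postgres, the table and column names are
hypothetical, and the ORDER BY is an addition of mine to make the
"first host" deterministic.

```python
import logging
import sqlite3

log = logging.getLogger("hosts")

def find_host(conn, identifier, identifier_type):
    """Return the matching reservation row, or None if there is none.

    LIMIT 2 is enough to tell 0, 1 and "more than 1" apart without
    counting every duplicate.
    """
    rows = conn.execute(
        "SELECT host_id, ip4, hostname FROM hosts"
        " WHERE identifier = ? AND identifier_type = ?"
        " ORDER BY host_id LIMIT 2",
        (identifier, identifier_type),
    ).fetchall()
    if not rows:
        return None            # no reservation: carry on as usual
    if len(rows) > 1:
        log.warning("multiple reservations match %s (type %d); using host_id %d",
                    identifier, identifier_type, rows[0][0])
    return rows[0]

# Small demo data set with one duplicated identifier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (host_id INTEGER PRIMARY KEY, identifier TEXT,"
             " identifier_type INTEGER, ip4 TEXT, hostname TEXT)")
conn.executemany(
    "INSERT INTO hosts (identifier, identifier_type, ip4, hostname)"
    " VALUES (?, ?, ?, ?)",
    [("aa:bb:cc:dd:ee:01", 0, "192.0.2.1", "alpha"),
     ("aa:bb:cc:dd:ee:02", 0, "192.0.2.2", "beta"),
     ("aa:bb:cc:dd:ee:02", 0, "192.0.2.3", "beta-dup")],
)
```

Calling find_host(conn, "aa:bb:cc:dd:ee:02", 0) on this data logs one
warning and returns the row with the lower host_id.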

4. Options in DB
We will also need a table for storing options in the database. The
immediate need is to allow storing options for a specific host, but soon
after that we'll need options associated with a given class or with a
given subnet (once we get the capability to store subnets in a DB).

The proposed table looks as follows:

Options table
- option code (uint16, option code in DHCPv4 or DHCPv6 sense)
- value string (options have diverse formats. Using a string will allow
  values to be stored in a uniform way, e.g. for format uint32, bool,
  string we'll store something like 1234,true,"hello world")
- option space (see explanation below)
- host-id - foreign key (a reference to hosts table)
- client-class string (explanation below)
- subnet-id uint32_t
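
To make the value string idea concrete, here is a small sketch of how
such a string could be turned back into typed fields. The function name,
the format names and the use of CSV-style quoting are assumptions for
the example; the exact encoding was not specified at the hackathon.

```python
import csv

def parse_option_value(value, formats):
    """Split a stored value string and convert each field by its format."""
    # skipinitialspace tolerates "1234, true"; quotes protect commas
    # embedded in string fields.
    fields = next(csv.reader([value], skipinitialspace=True))
    if len(fields) != len(formats):
        raise ValueError("expected %d fields, got %d"
                         % (len(formats), len(fields)))
    result = []
    for text, fmt in zip(fields, formats):
        if fmt == "uint32":
            number = int(text)
            if not 0 <= number <= 0xFFFFFFFF:
                raise ValueError("uint32 out of range: %s" % text)
            result.append(number)
        elif fmt == "bool":
            if text not in ("true", "false"):
                raise ValueError("bad bool: %s" % text)
            result.append(text == "true")
        elif fmt == "string":
            result.append(text)
        else:
            raise ValueError("unsupported format: %s" % fmt)
    return result
```

The list of formats would come from the option definition, so the same
stored string can only be interpreted once the definition is known.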

Option space will be the same as we have in Kea now. There are 2 defined
out of the box: "dhcp4" and "dhcp6", but we can easily define vendor option
namespaces. Also, for custom options it is now possible to define that
option A belongs to "dhcp4" and contains option space "foo" and options
B and C belong to option space "foo".

It is understood and agreed that for any custom or vendor options, there
will be definitions needed. Those will be kept in the config file.

Client-class will be an optional field that will specify that a given
option is for hosts that belong to that class.

Subnet-id performs similar selective role for subnets.

Host-id, client-class and subnet-id can be considered selectors, i.e.
properties that will be used to find appropriate options. There was a
discussion on what the query should look like:

host-id=X AND client-class=Y AND subnet-id=Z

host-id=X OR client-class=Y OR subnet-id=Z

There was also a brief discussion of how MySQL treats NULLs, as not all
values will be specified at all times. This particular topic will need
further thought.
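
The NULL question can be explored with a small experiment. The sketch
below uses SQLite from Python's standard library as a stand-in for
MySQL; comparing anything to NULL with "=" yields NULL (not true) in
standard SQL, so both behave alike on this point. Table contents and
column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE options (code INTEGER, value TEXT, host_id INTEGER,"
             " client_class TEXT, subnet_id INTEGER)")
conn.executemany("INSERT INTO options VALUES (?, ?, ?, ?, ?)", [
    (6, "192.0.2.53", 1, None, None),          # host-specific option
    (15, "example.org", None, "gold", None),   # class-specific option
    (3, "192.0.2.1", None, None, 7),           # subnet-specific option
])

def codes(where, params):
    """Return the sorted option codes selected by a WHERE clause."""
    sql = "SELECT code FROM options WHERE " + where
    return sorted(row[0] for row in conn.execute(sql, params))

selectors = (1, "gold", 7)

# AND: every row has at least one NULL selector, so nothing matches.
and_codes = codes("host_id = ? AND client_class = ? AND subnet_id = ?",
                  selectors)

# OR: a single matching selector is enough, NULLs elsewhere do not hurt.
or_codes = codes("host_id = ? OR client_class = ? OR subnet_id = ?",
                 selectors)

# '= NULL' never matches; IS NULL is needed to find unset selectors.
eq_null = codes("client_class = ?", (None,))
is_null = codes("client_class IS NULL", ())
```

With rows that set only one selector each, the AND form returns nothing
and the OR form returns all three options, which suggests the AND form
would need explicit "OR column IS NULL" terms to be usable.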

5. Hosts cached or not
In theory, hosts can be kept either in memory or in the database only.
In the in-memory approach, host information is read from the DB and kept
in memory. This is good from a performance perspective, but requires
administrative action (the admin needs to tell the server that host
reservations changed) every time host information changes in the DB. On
the other hand, it is possible to not keep anything in memory and check
with the DB every time. This will be less performant (an extra query for
each packet processed), but will have the benefit of always being up to
date (any changes to the DB will be picked up immediately).

Adam proposed a mixed mode, where the server has a cache that is
gradually filled with data that is read from the DB. This is an
interesting proposal as it gives fast startup and ultimately high
performance. We will likely implement it one day, but it is probably not
feasible in the 0.9.1 timeframe. Nevertheless, we need to make the
design extensible to allow adding it at some later time.
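
A minimal sketch of the mixed mode might look like the class below. It
is not Kea code: the class name, the backend_lookup callable and the
decision to cache misses as well as hits are all assumptions made for
illustration.

```python
class LazyHostCache:
    """Cache that starts empty and fills as lookups hit the backend."""

    def __init__(self, backend_lookup):
        # backend_lookup(identifier, identifier_type) -> host or None;
        # a stand-in for a real database query.
        self._lookup = backend_lookup
        self._hosts = {}
        self.backend_queries = 0

    def get(self, identifier, identifier_type):
        """Return the host for this identifier, querying at most once."""
        key = (identifier_type, identifier)
        if key not in self._hosts:
            self.backend_queries += 1
            # Misses (None) are cached too, so unknown clients do not
            # trigger a query for every packet.
            self._hosts[key] = self._lookup(identifier, identifier_type)
        return self._hosts[key]

    def flush(self):
        """Forget everything; later lookups go back to the backend."""
        self._hosts.clear()
```

Startup is fast because nothing is loaded up front, and repeated lookups
for the same client cost one query in total. The price is the same as
for the fully in-memory mode: entries already cached go stale when the
DB changes, until something flushes them.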

6. Required and proposed commands
If we implement the strategy of keeping hosts in memory, then we'll need
at least the following commands:
- delete-host - deletes one host. It would be useful to use
  identifier/identifier type as parameters for consistency
- flush-cache - drops the existing list of hosts from the in-memory
  cache
- reload-hosts - drops all hosts from memory and reloads them from the
  DB

We also talked about reloading hosts from the DB. In the typical case,
there will be thousands of mostly static host reservations, with a
handful of new ones being added or deleted. The naive approach of
dropping everything from memory and reloading will cause the server to
reload mostly the same data with a minimal delta, which will be a
performance problem. It may be useful to have one extra command:
add-host. The full implications are not yet understood, though. The
behavior must be well defined: does add-host only add a host to the
in-memory cache, to be lost after restart? Or will it also insert the
host into the DB?
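
To show how the commands relate to each other, here is a toy model in
which a plain dict stands in for the DB. The class, the method names and
especially the persist flag are hypothetical; the flag simply exposes
the two possible add-host behaviors so they can be compared.

```python
class HostStore:
    """In-memory host list backed by a dict standing in for the DB."""

    def __init__(self, db):
        self.db = db                  # {(identifier_type, identifier): host}
        self.memory = dict(db)        # initial full load

    def delete_host(self, identifier, identifier_type):
        """Drop one host from memory; return True if it was present."""
        key = (identifier_type, identifier)
        return self.memory.pop(key, None) is not None

    def flush_cache(self):
        """Drop the whole in-memory host list."""
        self.memory.clear()

    def reload_hosts(self):
        """Drop everything and read the full set from the DB again."""
        self.memory = dict(self.db)

    def add_host(self, identifier, identifier_type, host, persist=False):
        """Add a host to memory; also write it to the DB if persist is set."""
        key = (identifier_type, identifier)
        self.memory[key] = host
        if persist:
            self.db[key] = host
```

With persist left at False, a host added this way disappears on the next
reload, which is exactly the ambiguity raised above. Whether delete-host
should also touch the DB is a similar open question that this sketch
does not try to settle.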

7. Feature request
One missing feature is config file inclusion. Adam requested this
capability as it is very commonly used. Kea doesn't have such a
capability yet. It is supported by ISC DHCP, though.

8. DHCPv6 Bulk leasequery in Dibbler
Damian Manelski is working on bulk leasequery support in Dibbler.
The Dibbler server was extended with TCP sockets and bulk leasequery
processing. There are bits and pieces still missing, though. Several
bulk leasequery scenarios rely on options inserted by relay agents, in
particular on the remote-id and relay-id options. ISC DHCP does not
insert those options. Dibbler inserts only remote-id, but not relay-id.
We managed to implement support for relay-id. Even though this is not
directly related to Kea, it may still be useful as a testing tool, even
before we start implementing bulk leasequery in Kea. Even as of today,
we do have code for extracting options inserted by relays. Using
remote-id or relay-id for testing its correctness may be useful.

9. MAC addresses support in DHCPv6
Adam Kalmus started work on extracting MAC addresses from the DUID. This
is one of the tickets we planned for 0.9.1. This work is in its early
stages, but the environment was set up and the first lines of code were
written. That's a good start. We're keeping our fingers crossed.
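
For readers unfamiliar with the idea, here is a small sketch of what
extracting a MAC address from a DUID involves. It is not the code
written at the hackathon; it follows the DUID-LLT and DUID-LL layouts
from RFC 3315 and handles only Ethernet addresses. The function name is
made up.

```python
import struct

DUID_LLT = 1         # link-layer address plus time
DUID_LL = 3          # link-layer address only
HTYPE_ETHERNET = 1   # hardware type for Ethernet

def mac_from_duid(duid):
    """Return the MAC as 'aa:bb:..' text, or None if the DUID has none."""
    if len(duid) < 4:
        return None
    duid_type, hw_type = struct.unpack("!HH", duid[:4])
    if duid_type == DUID_LLT:
        lladdr = duid[8:]      # skip type, hardware type and 4-byte time
    elif duid_type == DUID_LL:
        lladdr = duid[4:]      # skip type and hardware type
    else:
        return None            # e.g. DUID-EN carries no link-layer address
    if hw_type != HTYPE_ETHERNET or len(lladdr) != 6:
        return None
    return ":".join("%02x" % byte for byte in lladdr)
```

Note that the address embedded in a DUID is fixed when the DUID is
generated, so it is not guaranteed to match the interface the client is
actually using.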

No slides were shown during the event.

Thanks to doc. Krzysztof Nowicki and the Department of Computer
Communications from Gdansk University of Technology for hosting us!

Tomek Mrugalski
ISC