[kea-dev] Problem with subnet4_select and lease4_select behaviour with Kea 1.1 and 1.3

Dave Cole davecole at nbnco.com.au
Fri Nov 17 01:01:26 UTC 2017


Sorry for the length of this message, but I wanted to be clear in describing the problem we are seeing.

We have been trying to do something with Kea that is similar to what Facebook describes in this presentation:

https://www.usenix.org/conference/srecon15europe/program/presentation/failla

Basically they stub out subnet4_select and lease4_select so that Kea does nothing for those steps, then they mathematically determine the subnet and ip address values and poke them in during pkt4_send.

This allows them to deploy horizontally scaled stateless DHCP servers.

Our requirements are similar, but we cannot perform a simple mathematical calculation.  I have built a hook extension with an embedded Python interpreter.  The allows us to do some fancy stuff with PostgreSQL and Kafka by simply importing some Python libraries.

A simple representation of the Facebook trick looks like this:

- - 8< - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
import kea
import ipaddress

pkt_type = dict([(v, k[4:])
                 for k, v in kea.__dict__.iteritems()
                 if k.startswith('DHCP')])
subnet = ipaddress.IPv4Network(u'10.0.0.0/20')

def pkt4_receive(handle):
    query = handle.getArgument('query4')
    kea.log('pkt4_receive - query is %s' % pkt_type[query.getType()])
    option82 = query.getOption(kea.DHO_DHCP_AGENT_OPTIONS)
    if option82 is None:
        kea.log('pkt4_receive - option82 not found - drop transaction')
        handle.setStatus(kea.NEXT_STEP_SKIP)
    return 0

def subnet4_select(handle):
    query = handle.getArgument('query4')
    kea.log('subnet4_select - query is %s' % pkt_type[query.getType()])
    handle.setStatus(kea.NEXT_STEP_SKIP)
    return 0

def lease4_select(handle):
    query = handle.getArgument('query4')
    kea.log('lease4_select - query is %s' % pkt_type[query.getType()])
    handle.setStatus(kea.NEXT_STEP_SKIP)
    return 0

def pkt4_send(handle):
    response = handle.getArgument('response4')
    kea.log('pkt4_send - response is %s' % pkt_type[response.getType()])
    response.setYiaddr(str(subnet[42]))
    response.setSiaddr(str(subnet[1]))
    response.delOption(kea.DHO_SUBNET_MASK)
    response.addOption(kea.Option(kea.DHO_SUBNET_MASK).setString(subnet.netmask.packed))
    response.delOption(kea.DHO_DHCP_LEASE_TIME)
    response.addOption(kea.Option(kea.DHO_DHCP_LEASE_TIME).setUint32(3600))
    if response.getType() == kea.DHCPACK:
        response.delOption(kea.DHO_DHCP_RENEWAL_TIME)
        response.addOption(kea.Option(kea.DHO_DHCP_RENEWAL_TIME).setUint32(1800))
        response.delOption(kea.DHO_DHCP_REBINDING_TIME)
        response.addOption(kea.Option(kea.DHO_DHCP_REBINDING_TIME).setUint32(3150))
    return 0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

We use some values in Option 82 to drive our "static" allocation strategy, so any received packet without Option 82 can just be discarded.

The problem is, it does not work.  As soon as subnet4_select tells Kea to skip the step it logs the following and stops.

The way we are requesting an address is with the dhtest program.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2017-11-16 23:37:22.885 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUTS_BEGIN begin all callouts for hook pkt4_receive
2017-11-16 23:37:22.885 INFO  [kea-dhcp4.kea-python/3554] PYTHON_LOG pkt4_receive_proxy
2017-11-16 23:37:22.885 INFO  [kea-dhcp4.kea-python/3554] PYTHON_LOG pkt4_receive - query is DISCOVER
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook pkt4_receive that has address 0x7f901f1b4dee (callout duration: 0.245 ms)
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUTS_COMPLETE completed callouts for hook pkt4_receive (total callouts duration: 0.245 ms)
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.dhcpsrv/3554] DHCPSRV_CFGMGR_SUBNET4_ADDR selected subnet 0.0.0.0/0 for packet received by matching address 100.100.100.100
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUTS_BEGIN begin all callouts for hook subnet4_select
2017-11-16 23:37:22.886 INFO  [kea-dhcp4.kea-python/3554] PYTHON_LOG subnet4_select - query is DISCOVER
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook subnet4_select that has address 0x7f901f1b4eed (callout duration: 0.080 ms)
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.callouts/3554] HOOKS_CALLOUTS_COMPLETE completed callouts for hook subnet4_select (total callouts duration: 0.080 ms)
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.hooks/3554] DHCP4_HOOK_SUBNET4_SELECT_SKIP [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x581626b4: no subnet was selected, because a callout set skip flag.
2017-11-16 23:37:22.886 DEBUG [kea-dhcp4.ddns/3554] DHCP4_CLIENT_HOSTNAME_PROCESS [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x581626b4: processing client's Hostname option
2017-11-16 23:37:22.886 ERROR [kea-dhcp4.bad-packets/3554] DHCP4_PACKET_NAK_0001 [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x581626b4: failed to select a subnet for incoming packet, src 0.0.0.0, type DHCPDISCOVER
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

In response to that I removed the subnet4_select callout and allowed the server to process to lease4_select.  This time it fails with a DHCP4_PACKET_NAK_003 in the following way:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.alloc-engine/3590] ALLOC_ENGINE_V4_OFFER_NEW_LEASE allocation engine will try to offer new lease to the client [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x7fae45fa
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.hosts/3590] HOSTS_CFG_GET_ONE_SUBNET_ID_ADDRESS4 get one host with reservation for subnet id 1 and IPv4 address 0.0.0.1
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.hosts/3590] HOSTS_CFG_GET_ALL_ADDRESS4 get all hosts with reservations for IPv4 address 0.0.0.1
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.hosts/3590] HOSTS_CFG_GET_ALL_ADDRESS4_COUNT using address 0.0.0.1, found 0 host(s)
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.hosts/3590] HOSTS_CFG_GET_ONE_SUBNET_ID_ADDRESS4_NULL host not found using subnet id 1 and address 0.0.0.1
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.dhcpsrv/3590] DHCPSRV_MEMFILE_GET_ADDR4 obtaining IPv4 lease for address 0.0.0.1
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.callouts/3590] HOOKS_CALLOUTS_BEGIN begin all callouts for hook lease4_select
2017-11-16 23:40:51.770 INFO  [kea-dhcp4.kea-python/3590] PYTHON_LOG lease4_select - query is DISCOVER
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.callouts/3590] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook lease4_select that has address 0x7faf0fe58f80 (callout duration: 0.030 ms)
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.callouts/3590] HOOKS_CALLOUTS_COMPLETE completed callouts for hook lease4_select (total callouts duration: 0.030 ms)
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.dhcpsrv/3590] DHCPSRV_HOOK_LEASE4_SELECT_SKIP Lease4 creation was skipped, because of callout skip flag.
2017-11-16 23:40:51.770 WARN  [kea-dhcp4.alloc-engine/3590] ALLOC_ENGINE_V4_ALLOC_FAIL [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x7fae45fa: failed to allocate an IPv4 address after 0 attempt(s)
2017-11-16 23:40:51.770 DEBUG [kea-dhcp4.bad-packets/3590] DHCP4_PACKET_NAK_0003 [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x7fae45fa: failed to advertise a lease, client sent ciaddr 0.0.0.0, requested-ip-address (no address)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

So the only way I can get it to work is with the following code:

- - 8< - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
import kea
import ipaddress

pkt_type = dict([(v, k[4:])
                 for k, v in kea.__dict__.iteritems()
                 if k.startswith('DHCP')])
subnet = ipaddress.IPv4Network(u'10.0.0.0/20')

def pkt4_receive(handle):
    query = handle.getArgument('query4')
    kea.log('pkt4_receive - query is %s' % pkt_type[query.getType()])
    option82 = query.getOption(kea.DHO_DHCP_AGENT_OPTIONS)
    if option82 is None:
        kea.log('pkt4_receive - option82 not found - drop transaction')
        handle.setStatus(kea.NEXT_STEP_SKIP)
    return 0

def lease4_select(handle):
    query = handle.getArgument('query4')
    kea.log('lease4_select - query is %s' % pkt_type[query.getType()])
    if query.getType() == kea.DHCPREQUEST:
        kea.debug('lease4_select - do not store lease')
        handle.setStatus(kea.NEXT_STEP_SKIP)
    return 0

def pkt4_send(handle):
    response = handle.getArgument('response4')
    kea.log('pkt4_send - response is %s' % pkt_type[response.getType()])
    if response.getType() == kea.DHCPNAK:
        kea.log('pkt4_send - kea wants to NAK, override to ACK')
        response.setType(kea.DHCPACK)
    response.setYiaddr(str(subnet[42]))
    response.setSiaddr(str(subnet[1]))
    response.delOption(kea.DHO_SUBNET_MASK)
    response.addOption(kea.Option(kea.DHO_SUBNET_MASK).setString(subnet.netmask.packed))
    response.delOption(kea.DHO_DHCP_LEASE_TIME)
    response.addOption(kea.Option(kea.DHO_DHCP_LEASE_TIME).setUint32(3600))
    if response.getType() == kea.DHCPACK:
        response.delOption(kea.DHO_DHCP_RENEWAL_TIME)
        response.addOption(kea.Option(kea.DHO_DHCP_RENEWAL_TIME).setUint32(1800))
        response.delOption(kea.DHO_DHCP_REBINDING_TIME)
        response.addOption(kea.Option(kea.DHO_DHCP_REBINDING_TIME).setUint32(3150))
    return 0
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Looking at the log file I now see this, and the client gets the correct address:

$ sudo dhtest -i eth1 -g 100.100.100.100 -c 82,str,hello
DHCP discover sent	 - Client MAC : 08:00:27:95:e8:a1
DHCP offer received	 - Offered IP : 10.0.0.42
DHCP request sent	 - Client MAC : 08:00:27:95:e8:a1
DHCP ack received	 - Acquired IP: 10.0.0.42

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2017-11-17 00:45:06.140 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUTS_BEGIN begin all callouts for hook lease4_select
2017-11-17 00:45:06.140 INFO  [kea-dhcp4.kea-python/3763] PYTHON_LOG lease4_select - query is REQUEST
2017-11-17 00:45:06.140 DEBUG [kea-dhcp4.kea-python/3763] PYTHON_LOG lease4_select - do not store lease
2017-11-17 00:45:06.140 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook lease4_select that has address 0x7f277a87bf80 (callout duration: 0.249 ms)
2017-11-17 00:45:06.140 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUTS_COMPLETE completed callouts for hook lease4_select (total callouts duration: 0.249 ms)
2017-11-17 00:45:06.140 DEBUG [kea-dhcp4.dhcpsrv/3763] DHCPSRV_HOOK_LEASE4_SELECT_SKIP Lease4 creation was skipped, because of callout skip flag.
2017-11-17 00:45:06.141 DEBUG [kea-dhcp4.bad-packets/3763] DHCP4_PACKET_NAK_0004 [hwtype=1 08:00:27:95:e8:a1], cid=[no info], tid=0x52a89446: failed to grant a lease, client sent ciaddr 0.0.0.0, requested-ip-address 10.0.0.42
2017-11-17 00:45:06.141 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUTS_BEGIN begin all callouts for hook pkt4_send
2017-11-17 00:45:06.141 INFO  [kea-dhcp4.kea-python/3763] PYTHON_LOG pkt4_send - response is NAK
2017-11-17 00:45:06.141 INFO  [kea-dhcp4.kea-python/3763] PYTHON_LOG pkt4_send - kea wants to NAK, override to ACK
2017-11-17 00:45:06.141 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUT_CALLED hooks library with index 1 has called a callout on hook pkt4_send that has address 0x7f277a87c1cc (callout duration: 0.680 ms)
2017-11-17 00:45:06.141 DEBUG [kea-dhcp4.callouts/3763] HOOKS_CALLOUTS_COMPLETE completed callouts for hook pkt4_send (total callouts duration: 0.680 ms)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The problem with this solution is that Kea still seems to be allocating IP addresses from the subnet.

According to the hook documentation, the Facebook trick should still work and my work-around should not be necessary.

Can anyone explain why the Facebook trick does not work anymore?  We have tried version 1.1 and 1.3 with the same results.

If anyone is interested, we intend to contribute our hook back to the commnunity.  For now the API coverage is restricted to the functionality that we require, but it would not be hard to follow the pattern of the existing code to expose more Kea object types.


More information about the kea-dev mailing list