RENEWING v4 client + IOS unnumbered interface relay + ISC DHCP server + failover = trouble
David Sacerdote
dsacer at cisco.com
Thu Apr 26 21:37:36 UTC 2007
When providing DHCP relay service on unnumbered interfaces, the Cisco
IOS DHCP relay does not let clients send DHCPREQUEST messages directly
to the DHCP server. Instead, it rewrites messages from the DHCP server
to the client to insert its own address, and then relays all subsequent
client messages to the DHCP server. This lets the relay determine when
the lease ends, so that the route it inserted when the lease was created
can be removed at that point.
Unfortunately, this has an interesting side effect: when a DHCPv4 client
is in the RENEWING state and the Cisco IOS DHCP relay is configured with
more than one helper-address, DHCPREQUEST messages are sent to multiple
DHCP servers, something which version 3.0.5 of the ISC DHCP server seems
to assume never happens:
From server/dhcp.c:
void dhcprequest (packet, ms_nulltp, ip_lease)
...
* If it's RENEWING, we are the only server to hear it, so
* we have to serve it.
This means that when there are two ISC DHCP servers in failover mode,
and both are up, clients see two ACK messages whenever they send a
DHCPREQUEST message. This wouldn't be so bad, except that there are
higher-layer services that depend on the state of a DHCP lease to
establish a session concept, and the ACK message from the wrong server
makes it look as if the client went into the REBINDING state.
Modifying Cisco IOS seems to be only a partial solution here: some DHCP
clients (notably the Cisco one) seem to like broadcasting their
DHCPREQUEST messages irrespective of whether they're in the RENEWING or
the REBINDING state, and the DHCP broadcast flag is similarly unreliable.
I have rigged a version of the IOS DHCP relay to try to track whether the
client got a response from the previous server it tried, but that's a
recipe for any kind of packet loss to trigger the problem again.
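
For context, RFC 2131 (section 4.3.2) distinguishes the client states
behind a DHCPREQUEST only by which fields are filled in and, for
RENEWING versus REBINDING, by whether the client unicast or broadcast
the message. A minimal sketch of that classification follows. The struct,
enum, and function names are hypothetical and are not taken from the ISC
DHCP or IOS sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Client states that can produce a DHCPREQUEST (RFC 2131, section 4.3.2). */
enum client_state {
    ST_SELECTING, ST_INIT_REBOOT, ST_RENEWING, ST_REBINDING, ST_UNKNOWN
};

/* Hypothetical, flattened view of the fields that matter. */
struct request_view {
    bool has_server_id;     /* option 54 (server identifier) present */
    bool has_requested_ip;  /* option 50 (requested IP address) present */
    uint32_t ciaddr;        /* zero when the client left it unset */
    bool arrived_unicast;   /* how the client originally sent the message */
};

enum client_state classify(const struct request_view *r)
{
    assert(r != NULL);
    if (r->has_server_id)
        return ST_SELECTING;      /* answering a specific server's offer */
    if (r->has_requested_ip)
        return ST_INIT_REBOOT;    /* verifying a previously held address */
    if (r->ciaddr != 0)           /* extending an active lease */
        return r->arrived_unicast ? ST_RENEWING : ST_REBINDING;
    return ST_UNKNOWN;
}
```

The last branch is the weak point: a server behind a relay receives every
request as a unicast from the relay, so it has no direct way to fill in
arrived_unicast, and a client that always broadcasts defeats the same
check on the relay.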
So does anybody have suggestions about how one would go about modifying
the ISC DHCP server to check whether its failover peer is up, and to not
respond to DHCPREQUEST messages intended for its peer if that peer is
indeed functional? Is there some major pitfall that I should be aware of?
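
To make the question concrete, the check being asked about could look
roughly like the sketch below. This is only an illustration under stated
assumptions: the enum, struct, and function are hypothetical, do not
correspond to the real ISC DHCP failover data structures, and assume the
server can tell which peer a lease belongs to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of what this server believes about its peer. */
enum peer_state {
    PEER_NORMAL,                   /* peer is up and in contact */
    PEER_COMMUNICATIONS_INTERRUPTED, /* contact lost, peer may still be up */
    PEER_DOWN                      /* peer is known to be down */
};

/* Hypothetical summary of a relayed DHCPREQUEST that extends a lease. */
struct renew_request {
    bool lease_owned_by_peer;  /* assumption: ownership is known per lease */
    enum peer_state peer;
};

/* Return true if this server should send the ACK itself. */
bool should_answer_renew(const struct renew_request *req)
{
    assert(req != NULL);
    if (!req->lease_owned_by_peer)
        return true;               /* our lease: always serve it */
    /* Peer's lease: stay silent only while the peer is believed functional. */
    return req->peer != PEER_NORMAL;
}
```

With this rule a client would see a single ACK while both servers are in
contact, and the surviving server would still answer once it can no
longer confirm that its peer is functional.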