Failback causes lost lease

Gregory Sloop gregs at sloop.net
Fri Jun 26 01:15:26 UTC 2015



On 06/25/2015 01:39 PM, Gregory Sloop wrote:
Re: Failback causes lost lease 

Gregory,

Thanks for your reply.

On 06/25/2015 12:47 PM, Gregory Sloop wrote:
Re: Failback causes lost lease SM> In testing my dhcp failover, I pulled the ethernet cable on the primary
SM> server.

SM> The secondary server acknowleged renewal requests as expected.

SM> Then I plugged the cable back in. After both the primary and secondary
SM> had moved from communications-interrupted to normal, the secondary logs
SM> multiple dhcp requests from a client whose lease is owned by the primary
SM> server. The primary server does not log any of these but the last 
SM> request, reporting that "lease in transition state expired".

SM> Then the secondary server logs a DHCPDISCOVER from that client and 
SM> records it load balancing to the primary server.

SM> The primary server also sees the DHCPDISCOVER and offers a new lease 
SM> that is not the same number as the previous lease. This despite the old
SM> number not having been reassigned.

SM> The end result is that failback causes my clients to change their ip 
SM> address.

SM> Why does this happen and how can I prevent it?

SM> _______________________________________________
SM> dhcp-users mailing list
SM> dhcp-users at lists.isc.org
SM> https://lists.isc.org/mailman/listinfo/dhcp-users

1) Logs would be good.
Primary Server:
=========

Jun 25 11:33:00 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:33:00 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:36:23 dhcpd: timeout waiting for failover peer partner
Jun 25 11:36:23 dhcpd: peer partner: disconnected
Jun 25 11:36:23 dhcpd: failover peer partner: I move from normal to communications-interrupted
Jun 25 11:42:38 dhcpd: failover peer partner: peer moves from normal to communications-interrupted
Jun 25 11:42:38 dhcpd: failover peer partner: I move from communications-interrupted to normal
Jun 25 11:42:38 dhcpd: balancing pool 7f4c9a14cb80 10.112.0.0/13  total 1024  free 511  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f4c9a14cb80 10.112.0.0/13  total 1024  free 511  backup 512  lts 0  max-misbal 153
Jun 25 11:42:38 dhcpd: balancing pool 7f4c9a143260 10.104.0.0/13  total 1024  free 512  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f4c9a143260 10.104.0.0/13  total 1024  free 512  backup 512  lts 0  max-misbal 154
Jun 25 11:42:38 dhcpd: balancing pool 7f4c9a13a930 10.96.0.0/13  total 1024  free 512  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f4c9a13a930 10.96.0.0/13  total 1024  free 512  backup 512  lts 0  max-misbal 154
Jun 25 11:42:38 dhcpd: failover peer partner: peer moves from communications-interrupted to normal
Jun 25 11:43:38 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112: lease in transition state expired
Jun 25 11:43:39 dhcpd: DHCPDISCOVER from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:43:40 dhcpd: DHCPOFFER on 10.112.2.201 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:43:40 dhcpd: uid lease 10.112.1.1 for client 00:19:d1:4d:03:f8 is duplicate on 10.112.0.0/13
Jun 25 11:43:40 dhcpd: DHCPREQUEST for 10.112.2.201 (10.112.0.104) from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:43:40 dhcpd: DHCPACK on 10.112.2.201 to 00:19:d1:4d:03:f8 via em2.112

Secondary Server:
===========
Jun 25 11:36:21 dhcpd: timeout waiting for failover peer partner
Jun 25 11:36:21 dhcpd: peer partner: disconnected
Jun 25 11:36:21 dhcpd: failover peer partner: I move from normal to communications-interrupted
Jun 25 11:37:34 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:37:34 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:39:10 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:39:10 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:40:11 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:40:11 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:40:44 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:40:44 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:41:16 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:41:16 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:41:47 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:41:47 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:42:12 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:42:12 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:42:37 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:42:37 dhcpd: DHCPACK on 10.112.1.1 to 00:19:d1:4d:03:f8 via em2.112
Jun 25 11:42:38 dhcpd: failover peer partner: peer moves from normal to communications-interrupted
Jun 25 11:42:38 dhcpd: failover peer partner: I move from communications-interrupted to normal
Jun 25 11:42:38 dhcpd: balancing pool 7f325dbfdb30 10.112.0.0/13  total 1024  free 511  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f325dbfdb30 10.112.0.0/13  total 1024  free 511  backup 512  lts 0  max-misbal 153
Jun 25 11:42:38 dhcpd: balancing pool 7f325dbf4210 10.104.0.0/13  total 1024  free 512  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f325dbf4210 10.104.0.0/13  total 1024  free 512  backup 512  lts 0  max-misbal 154
Jun 25 11:42:38 dhcpd: balancing pool 7f325dbeb8e0 10.96.0.0/13  total 1024  free 512  backup 512  lts 0  max-own (+/-)102
Jun 25 11:42:38 dhcpd: balanced pool 7f325dbeb8e0 10.96.0.0/13  total 1024  free 512  backup 512  lts 0  max-misbal 154
Jun 25 11:42:38 dhcpd: failover peer partner: peer moves from communications-interrupted to normal
Jun 25 11:43:06 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112: lease owned by peer
Jun 25 11:43:13 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112: lease owned by peer
Jun 25 11:43:24 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112: lease owned by peer
Jun 25 11:43:38 dhcpd: DHCPREQUEST for 10.112.1.1 from 00:19:d1:4d:03:f8 via em2.112: lease owned by peer
Jun 25 11:43:39 dhcpd: DHCPDISCOVER from 00:19:d1:4d:03:f8 via em2.112: load balance to peer partner
Jun 25 11:43:40 dhcpd: DHCPREQUEST for 10.112.2.201 (10.112.0.104) from 00:19:d1:4d:03:f8 via em2.112: lease owned by peer

2) I think something with your config is broken. If I were to [wildly] guess, it's a physical/network layer issue.
3) I have a very small setup with 100+ clients, and it certainly doesn't work this way for me. 

There are some issues when a single server is up and in "communications interrupted" mode and you've got a tight IP pool and the leases were fairly evenly balanced against both servers. [I've posted, in the past, about an event that was kinda ugly for this client while running a 4.1 version [IIRC]. *However* those problems should be vastly less of a problem with 4.2+ - and you're not having an issue with communications interrupted anyway.

I am having an issue with communications interrupted. When I pull the ethernet cable, both the primary and secondary servers move from normal to  communications-interrupted.
But in your initial post on this thread you said: 

> "After both the primary and secondary
> had moved from communications-interrupted to normal"

It can't be both ways. Either they are CI, or in a Normal state. It can't be both.
Like I said, logs would probably be helpful. [Unless someone else has a lightening bolt moment and can tell you exactly what's wrong without them - but I doubt that.
I hope the logs above help.


---
You didn't answer the question: "It can't be both ways. Either they are CI, or in a Normal state."

But the logs you posted seem to indicate that both servers believe they're in "normal" mode when the DISCOVER	requests come in. Yet they only show the primary getting a single DISCOVER [the last one] while the secondary gets (4) over more than 30s.

Does a packet capture show the primary server actually receiving all the DHCPREQUEST packets that are going to the secondary?
It would be odd [typically a sign of lost packets] that the secondary would see (4) DHCPREQUEST's but the primary would only see one. It would be very helpful to know if those packets are actually arriving at the interface, but being ignored by dhcpd, or if they're not ever getting there at all. If it's the latter, it's probably not a dhcpd problem.

Searches on "lease in transition state expired" appear to find incidences of that output, as a result of bugs, but I could only find those kinds of problems recounted in *really* old, non-supported versions. [i.e. 3.0.x]

Are you running a currently supported version, and what version is it?


-Greg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.isc.org/pipermail/dhcp-users/attachments/20150625/c8cf8759/attachment.html>


More information about the dhcp-users mailing list