ctrace.c error / failover "peer holds all free leases"

Matt Causey matt.causey at gmail.com
Thu Apr 9 20:08:38 UTC 2009


There is no such timer that I can find in the source, and I have not
seen one in operational experience.

Our sites have two DHCP servers in the same rack, and sometimes one
fails. If the failure lasts long enough, eventually half the clients
start dying because they cannot get leases.

The design concept is that server A may be unable to talk to server B
while server B is still handing out leases somewhere.  In this
scenario, if server A starts scavenging server B's leases, it could
lead to duplicate IP addresses on the network.

Personally, I'd rather assume that this scenario won't happen on my
topology, so I've got a cron job which checks the failover status. If
a server is alive enough to run cron jobs and has been in
communications-interrupted for longer than a few minutes, the job
automatically places that server in partner-down.
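
A rough sketch of the kind of check such a cron job could run, not my
actual script. It assumes the failover state is read from the
"failover peer ... state" blocks that dhcpd writes to its lease file,
that the timestamps there are in UTC, and that the last block in the
file is the current one. The function names and the five-minute
threshold are made up for illustration, and the step that actually
moves the server to partner-down is left out.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches lines such as:
#   my state communications-interrupted at 4 2009/04/09 20:00:00;
# inside a 'failover peer "name" state { ... }' block of the lease file.
STATE_RE = re.compile(
    r"^\s*my state (?P<state>[\w-]+) at \d "
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});",
    re.MULTILINE,
)


def current_local_state(leases_text):
    """Return (state, since) from the last 'my state' line, or None."""
    matches = list(STATE_RE.finditer(leases_text))
    if not matches:
        return None
    last = matches[-1]  # later blocks supersede earlier ones
    since = datetime.strptime(last.group("ts"), "%Y/%m/%d %H:%M:%S")
    # Assumption: lease file timestamps are UTC.
    return last.group("state"), since.replace(tzinfo=timezone.utc)


def should_force_partner_down(leases_text, now,
                              threshold=timedelta(minutes=5)):
    """True if we have been in communications-interrupted too long."""
    found = current_local_state(leases_text)
    if found is None:
        return False
    state, since = found
    return state == "communications-interrupted" and now - since >= threshold
```

The cron job would read the lease file, call
should_force_partner_down(text, datetime.now(timezone.utc)), and only
then ask the server to change state. This is only safe if you are
confident the peer really is down and not just unreachable.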

Is there any way that such a timer could be added as a configurable
option?  Conceptually, would a patch with this functionality be
accepted by ISC?

Cheers,

--
Matt

2009/3/26 Foggi, Nicola <NFOGGI at depaul.edu>:
>
> Is there a timer that takes it from communications-interrupted to partner-down state?  It also appears that in communications-interrupted the reserved leases may not be honored.  Of course I didn't have debugging enabled, so I can't tell exactly what happened, but the primary server (which was up) leased a random new IP to the client instead of the reserved IP... back to the source code, ughh...
>
> maybe reserved leases and failover together aren't ready for production yet :(
>
> Nicola
>
> -----Original Message-----
> From: dhcp-users-bounces at lists.isc.org on behalf of Foggi, Nicola
> Sent: Thu 3/26/2009 11:36 AM
> To: dhcp-users at lists.isc.org
> Subject: ctrace.c error / failover "peer holds all free leases"
>
>
> running 3.1.2b1
>
> received this message:
>
> ctrace.c(168): trace_write_packet: short write (407:596)
>
> on one of the servers in a failover pair, caused the server to stop running, any ideas what it means?  The other problem is then the primary server started giving "peer holds all free leases" messages.  Our failover has:
>
>    mclt 900;
>    split 255;
>    load balance max seconds 3;
>
> so the "primary" server should have all leases to hand out if i read the documentation correctly, but turning on "DEBUG_FIND_LEASE" returned:
>
> dhcpd: Not returning a lease.
> dhcpd: DHCPDISCOVER from 00:1c:b3:62:e7:97 via 10.99.24.1: peer holds all free leases
>
> this is after more than 24 hours of the secondary server being down... any one else see this?
>
> Nicola
>
>


