dhcp 3.0.5 crashes with lots of lease imbalance messages

Mon Dec 14 03:29:45 UTC 2009

>Date: Mon, 14 Dec 2009 10:56:48 +0800
>From: "K. K. Tam" <k.k.tam at ust.hk>
>To: dhcp-users at lists.isc.org
>Subject: Re: dhcp 3.0.5 crashes with lots of lease imbalance messages
>X-BeenThere: dhcp-users at lists.isc.org
>
>The pools with short lease times have a total of 5240 IPs and client
>usage is around 50%.  The volume of DHCPACKs processed per minute
>before the failure varied, and it was close to zero most of the time
>because the server was flooded with lease imbalance messages.  Some
>samples showed it ranging from 60 to 130, then dropping to single
>digits or 0 for a few minutes (1 to 5) before the crash.  We have a
>cron job running every 5 minutes that monitors dhcpd and restarts it
>if it has crashed.

MCLT generally needs to be set lower than the lease time in a given
pool. Part of the reason is that, when doing failover, MCLT is the
length in seconds of the initial lease the server hands out, so it
doesn't make sense for it to be larger than the normal lease time. For
30 minute leases, an MCLT of maybe 10-15 minutes (600-900 seconds)
would be reasonable.
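
As a rough sketch of that advice, a failover setup for a pool with 30
minute leases might look something like the fragment below. The peer
name, addresses, port and range are placeholders, not values from your
configuration, and the 600 second MCLT is just the low end of the 10-15
minute suggestion.

        # On the primary; the secondary's declaration omits mclt and split.
        failover peer "dhcp-failover" {
                primary;
                address 192.0.2.1;            # placeholder address
                port 647;
                peer address 192.0.2.2;       # placeholder address
                peer port 647;
                max-response-delay 60;
                max-unacked-updates 10;
                mclt 600;                     # 10 minutes, below the lease time
                split 128;
                load balance max seconds 3;
        }

        subnet 192.0.2.0 netmask 255.255.255.0 {
                pool {
                        failover peer "dhcp-failover";
                        deny dynamic bootp clients;
                        default-lease-time 1800;   # 30 minute leases
                        max-lease-time 1800;
                        range 192.0.2.100 192.0.2.200;
                }
        }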

In later versions (3.1.0 onwards) there are a few configuration
statements to tune the balancing algorithm. In particular, min-balance
sets a minimum interval between balance requests, rather than sending
one after every lease allocation as happens in 3.0.x.

        max-lease-misbalance percentage;
        max-lease-ownership percentage;
        min-balance seconds;
        max-balance seconds;
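
These statements go inside the failover peer declaration. A sketch,
assuming the default values documented in dhcpd.conf(5) (15, 10, 60 and
3600); verify them against the man page for the version you install,
and treat the peer name as a placeholder:

        failover peer "dhcp-failover" {
                # ... primary/address/port/mclt statements as usual ...
                max-lease-misbalance 15;   # percent; threshold before rebalancing
                max-lease-ownership 10;    # percent; limit on leases kept beyond an even split
                min-balance 60;            # seconds; shortest gap between balance attempts
                max-balance 3600;          # seconds; longest gap between balance attempts
        }

Raising min-balance is the knob most directly aimed at a server that is
spending its time on balance requests instead of answering clients.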

>Do you have any suggestions on how we can diagnose our problem
>further?  As for the upgrade, are there any compatibility issues with
>the dhcp configuration file and the lease file, say when upgrading to
>v3.1.3?

The dhcpd.conf and dhcpd.leases files are compatible. To double check,
you can run the new version of dhcpd with the -t and -T flags, which
test the syntax of the dhcpd.conf and dhcpd.leases files respectively.

regards,
-glenn
--
Glenn Satchell                            |  Miss 9: What do you
Uniq Advances Pty Ltd, Sydney Australia   |  do at work Dad?
mailto:glenn.satchell at uniq.com.au         |  Miss 6: He just
http://www.uniq.com.au tel:0409-458-580   |  types random stuff.

>
>Thanks.
>K.K.
>
>
>On 12/12/2009 2:22 PM, Matt Causey wrote:
>> How big are the pools?  How many clients total?  What volume of 
>> DHCPACKs per minute are you processing at the time of failure?
>>
>> Hate to say it, but you should consider trying a newer version of the 
>> server.  I'm sure there is some diagnosis you can try with what you 
>> are running, but there have been a lot of bugfixes in the last 3 years 
>> since that release.
>>
>> -- 
>> Matt
>>
>> On Dec 11, 2009, at 0:59, "K. K. Tam" <k.k.tam at ust.hk> wrote:
>>
>>> Hi,
>>>
>>> Our site is running dhcp 3.0.5 in failover, and the primary and
>>> secondary servers run CentOS 4.  The service has been very stable
>>> since its rollout in 2007.  A few months ago we set up a few IP
>>> pools with relatively short lease times, from 15 to 30 minutes.
>>> Since then the dhcpd on the primary server has crashed from time to
>>> time, with lots of "lease imbalance - lts" messages like those
>>> below.
>>>
>>> Dec 11 01:33:55 <hostname> dhcpd: lease imbalance - lts = 23
>>> Dec 11 01:33:55 <hostname> dhcpd: lease imbalance - lts = 17
>>> Dec 11 01:33:55 <hostname> dhcpd: lease imbalance - lts = 3
>>>
>>> There are 800 to 1000 of these messages before the daemon crashes.
>>> The crashes seem to happen as usage in these pools increases.  The
>>> MCLT in our setup is 1800.  Our current workaround is to extend the
>>> lease time of these pools to 30 minutes and 2 hours, at the cost of
>>> higher IP usage.  The situation has improved a bit, but we still
>>> have crashes from time to time.  The worst case, before we extended
>>> some pools to a 2 hour lease time, was a crash roughly every 5 to 10
>>> minutes for a whole afternoon until usage slowed down.  We have
>>> about 12000 available IPs and active usage is around 60%.
>>>
>>> I have searched the mailing list archive and found that the problem
>>> occurred in some previous versions, but there was no concrete
>>> solution.  I wonder if anyone can provide some suggestions or
>>> solutions for this problem.
>>>
>>> Thanks in advance.
>>>
>>>
>>> Regards,
>>> K. K. Tam
>>>
>>> Email: k.k.tam at ust.hk
>>>
>>> _______________________________________________
>>> dhcp-users mailing list
>>> dhcp-users at lists.isc.org
>>> https://lists.isc.org/mailman/listinfo/dhcp-users