dhcp fails with big dhcpd.leases

dorian dorian33 at o2.pl
Wed Sep 1 09:54:36 UTC 2010


Simon Hobson wrote:
> dorian wrote:
>
>>>>  Looking at the above snippet: host with MAC 7c:c5:37:21:d9:7c asked
>>>>  several times for dhcp data.
>>>>  The first logs concerning this MAC which can be found are:
>>>>  Aug 31 12:54:03 [dhcpd] DHCPDISCOVER from 7c:c5:37:21:d9:7c via br0
>>>>  Aug 31 12:54:04 [dhcpd] DHCPOFFER on 172.18.93.227 to 7c:c5:37:21:d9:7c via br0
>>>>
>>>>  It means the host hasn't got an IP.
>>>
>>>  But note also, it does NOT request the address.
>> It does, but I omitted the request:
>> Aug 31 13:27:52 [dhcpd] DHCPREQUEST for 10.0.1.8 from 7c:c5:37:21:d9:7c via br0: wrong network.
>> Aug 31 13:27:52 [dhcpd] DHCPNAK on 10.0.1.8 to 7c:c5:37:21:d9:7c via br0
>
> At last, we seem to be getting somewhere ! That client did not get
> that address from your server - at least with the config you posted.
> However, that is about 1/2 hour after the previous logs.
Excuse my ignorance of the DHCP protocol, but to me the logs above
mean that the client asked for IP 10.0.1.8 and the DHCP server refused it.
But the client did not then get a correct (new, valid) address, although it
asked for one many more times.
>
>> I do not remember the whole DHCP protocol, so I don't know what is really
>> exchanged between client and server.
>
> As Sten says, you do need to know this if running a network
> professionally - sooner or later you will have a problem and you'll
> need to know details like that to see why/where it's going wrong.
It would be nice, of course. But my head has "limited RAM" and I cannot
remember all the protocols in detail at the same time.
The DHCP server is a small part of the system.
And in my opinion the problem lies with the server itself (or with an
incorrect configuration).
After restarting the server (with no dhcpd.leases file) it works for
around 1.5 months and then stops doing its job.
>
>> But according to my (maybe naive) reasoning, the host should be able to
>> ask for a brand-new IP without first asking to be assigned the "old" one.
>> Especially when it tries to get DHCP data while connecting to a totally
>> fresh network: there is no address to ask about.
>
> Yes, but if a client already has a lease which has not expired, then
> it may start the "conversation" by requesting the address it already
> has. If it's still on the same network then the server only has to ACK
> it - or if it's not valid for this network (as above), NACK it. After
> a NACK, the client has to stop using the address and start from
> scratch to get a new one.
>
> The full sequence is :
> Client - DHCP-Discover. Broadcasts a request for any DHCP servers to
> reply with offers of a lease.
> Server - DHCP-Offer. Sends an offer back to the client (broadcast or unicast).
> <client then selects an offer, since there may be more than one DHCP
> server on the network>
> Client - DHCP-Request. Broadcast a request to use the address offered.
> <the server must now permanently record the lease in the database>
> Server - DHCP-Ack. Confirm to the client that the address is valid.
> <the client can now configure its interface>
>
Clear. And in the logs I can find the "offers" with a new, correct address.
But for some reason the offer is either not delivered to the client, not
accepted by it, or ignored by it.
There is no final "DHCP-Ack" from the server.
But if I restart the DHCP server with no dhcpd.leases file, the client gets
an IP immediately.
So in my opinion the problem is on the server side.
It cannot cope with a large number of active leases.
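As a reading aid for the log lines, the exchange described above can be sketched as a small state machine. This is a simplified illustration only: the state names loosely follow the RFC 2131 client state diagram, renewal, rebinding and timeouts are left out, and the helper names (`start`, `step`, `run`) are made up for this sketch.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()        # no usable address yet
    SELECTING = auto()   # DHCPDISCOVER sent, waiting for a DHCPOFFER
    REQUESTING = auto()  # DHCPREQUEST sent for an offered address
    REBOOTING = auto()   # DHCPREQUEST sent for a previously held address
    BOUND = auto()       # DHCPACK received, interface can be configured


# (current state, message received from the server) -> next state
TRANSITIONS = {
    (State.SELECTING, "DHCPOFFER"): State.REQUESTING,
    (State.REQUESTING, "DHCPACK"): State.BOUND,
    (State.REQUESTING, "DHCPNAK"): State.INIT,
    (State.REBOOTING, "DHCPACK"): State.BOUND,
    (State.REBOOTING, "DHCPNAK"): State.INIT,
}


def start(has_old_lease: bool) -> State:
    """A client with an unexpired lease may skip DISCOVER and REQUEST directly."""
    return State.REBOOTING if has_old_lease else State.SELECTING


def step(state: State, server_message: str) -> State:
    """Apply one server message; unexpected messages leave the state unchanged."""
    return TRANSITIONS.get((state, server_message), state)


def run(has_old_lease: bool, server_messages: list[str]) -> State:
    """Replay a list of server messages and return the final client state."""
    state = start(has_old_lease)
    for message in server_messages:
        state = step(state, message)
        if state is State.INIT:
            # After a NAK the client drops the address and starts from scratch.
            state = start(has_old_lease=False)
    return state


if __name__ == "__main__":
    # Old lease refused, then an offer but no ACK: the client never reaches BOUND.
    print(run(True, ["DHCPNAK", "DHCPOFFER"]))
    # Old lease refused, then a complete exchange.
    print(run(True, ["DHCPNAK", "DHCPOFFER", "DHCPACK"]))
```

In this model a client that sees a NAK and then only offers never reaches `BOUND`, which is the situation the logs appear to show.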
>
>>>  If there is no Request, then the server has nothing to Ack.
>> Ok. I understand - the DHCPACK is sent only when the host asks for
>> the IP address and the IP is confirmed.
>>>  The ONLY request in that snippet is where 00:18:51:ce:b3:69 requests
>>>  172.27.140.7 but it is not a known lease. There isn't another instance
>>>  of that MAC address in the log you posted.
>>>
>>>  Now, why is it unknown ? Probably because you have broken your DHCP
>>>  server by deleting the leases file.
>> First of all - the core of the problem is:
>> a) when dhcpd.leases becomes "big" the server stops serving DHCP data
>> (or clients don't receive it)
>> b) stopping the server, removing dhcpd.leases and starting the server
>> fixes the problem immediately
>> And this is the problem that is the _main subject_ of my mails.
>>
>> The discussion of the message exchange comes from my suspicions, which
>> in turn come from my ignorance of the protocol.
>> BTW: I never wrote that I just delete the lease file.
>
> I'm confused. At times you seem to suggest that you "stop the server,
> delete the leases file, start the server", while at other times it
> sounds more like you "delete the leases file without stopping the
> server". Can you confirm which it is ?
I know my English is not perfect, so I believe this is a
misunderstanding.
I wanted to write that the server should detect it and "behave correctly" if
the dhcpd.leases file is deleted manually while the server is running.
But I never wrote that I did it (if I did, sorry; maybe I dropped some
steps while shortening the post).
>
>> And removing the dhcpd.leases file is "legal" when the server is not
>> running, isn't it ?
>
> NO, absolutely NOT.
> Your server has made promises to clients - ie "you can use <address>
> until <end time>". During the period of the leases given out, the
> client is given sole use of that address and it cannot be given to any
> other client. Once you delete the leases file and restart the server,
> that history of promises is lost and so the server cannot know what
> addresses are already in use on the network. It is likely to offer
> addresses to  clients that are already promised to other clients and
> so on. It also means that when a client returns to the network (after
> any leases it has have expired), the server has no record of the last
> address it had previously.
>
I understand the idea and what the dhcpd.leases file is for.
But it was _the only method_ I have found to get the DHCP server functional
again.
That is why I am posting to this mailing list: to fix the problem and to
keep the lease file untouched.

>>>>  Sorry. I do not understand.
>>>>  What is illegal or unusual with it?
>>>>
>>>>  172.16.8.0 belongs to 172.16.0.0/14
>>>>  and 172.16.0.0/14 is a part of the 172.16.0.0/12 private range
>>>>
>>>>  So what does 'behave "funny"' mean ?
>>>
>>>  There is nothing illegal or funny, but it is known that a small number
>>>  of badly programmed clients cannot cope with the last octet being 0 or
>>>  255 since everyone "knows" that 0 is the network address and 255 is
>>>  the broadcast address.
>> ???
>> Do you know which ones? Windows? MacOS? Mobile OSes?
>> Quite new info for me! Detecting the network from the IP alone?
>> I've always assumed that to get net & broadcast I need IP and mask.
>> Well, it's very interesting...
>
> You are right, all it **should** need is IP address and netmask.
> However, some people just don't understand that and have written
> software (not just DHCP clients) on the assumption that .0 is a
> network address and .255 is a broadcast address ! No, I can't say what
> clients are involved - but it's not any of the major ones.
So, let's forget about those cases...
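For the record, the point that network and broadcast addresses follow from IP plus mask, not from the last octet, can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch: the /14 prefix is the one quoted above, and `is_usable_host` is a helper name invented for this example.

```python
import ipaddress


def is_usable_host(addr: str, prefix_len: int) -> bool:
    """True if addr is neither the network nor the broadcast address
    of the subnet implied by the given prefix length."""
    iface = ipaddress.ip_interface(f"{addr}/{prefix_len}")
    net = iface.network
    return iface.ip not in (net.network_address, net.broadcast_address)


if __name__ == "__main__":
    net = ipaddress.ip_network("172.16.0.0/14")
    print(net.network_address, net.broadcast_address)

    # Inside a /14, addresses ending in .0 or .255 are ordinary hosts.
    print(is_usable_host("172.16.8.0", 14))
    print(is_usable_host("172.16.8.255", 14))

    # The same address is the network address only if the mask says so.
    print(is_usable_host("172.16.8.0", 24))

    # The /14 is contained in the 172.16.0.0/12 private range.
    print(net.subnet_of(ipaddress.ip_network("172.16.0.0/12")))
```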
>
>>>>  I need such a big IP range since in fact I have a network of hotspots
>>>>  working in bridge and centrally controlled from one host.
>>>
>>>  How many clients do you normally have on the network in any 2 hour
>>>  period ?
>> Daily I have about 60 clients per point and it grows.
>> Now I have 10 points. The plans are to have up to 1000 points.
>>>  Looking at your original log snippet, you seem to have less than one
>>>  request per second. For 250,000 clients and a 2 hours lease, you
>>>  should be seeing not less than about 34 request-ack or
>>>  discover-offer-request-ack exchanges per second.
>>>
>>>  I'd suggest it's worth cutting back on the address space and see if it
>>>  makes a difference.
>>>
>> The lease time is 72000, not 7200, which gives 20 hours (in practice a
>> whole day).
>> The wide IP range lets me assume that the same client (= same MAC)
>> will keep the same IP for the whole day.
>> What is more, with high probability it will get the same IP at another
>> hotspot on the following day(s).
>>
>> And differentiating between clients is very important for the business.
>
> So at the moment you only have about 600 clients.
600 clients per day. But each day it can be a different "set" of
clients. Per month this is >2000 different MACs.
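To put the numbers from this part of the thread side by side, here is a rough back-of-the-envelope sketch. It assumes every client completes exactly one exchange per lease period, so it gives only a lower bound (real clients usually renew well before expiry); the function name is made up for illustration.

```python
def min_exchanges_per_second(clients: int, lease_seconds: int) -> float:
    """Lower bound on DHCP exchanges per second if every client
    obtains or renews its lease once per lease period."""
    return clients / lease_seconds


if __name__ == "__main__":
    # The estimate quoted above: 250,000 clients with a 2 hour lease.
    print(round(min_exchanges_per_second(250_000, 7_200), 2))   # 34.72
    # The same pool size with the actual 72000 s (20 hour) lease.
    print(round(min_exchanges_per_second(250_000, 72_000), 2))  # 3.47
    # The current load: about 600 clients per day with the 72000 s lease.
    print(min_exchanges_per_second(600, 72_000))
```

With about 600 clients the expected rate is well below one exchange per second, which is consistent with the "less than one request per second" seen in the log snippet.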
> So try cutting down on the address space in the pool and see what
> happens. It's a significant variable, and unless you are prepared to
> try experimenting with changing such variables then we won't be able
> to help much. At the moment this is one that's under suspicion - so
> either disprove it or prove it, it's an easy change to make.
>
> Eg, change the range to "range 172.16.8.1 172.16.31.255" - that's
> still about 6000 addresses which should be more than enough.
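The size of the suggested range can be checked with a short calculation. This is just an illustrative sketch using Python's standard `ipaddress` module; `range_size` is a name invented here, and the /14 comparison uses the network mentioned earlier in the thread.

```python
import ipaddress


def range_size(first: str, last: str) -> int:
    """Number of addresses in an inclusive first..last IPv4 range."""
    low = int(ipaddress.IPv4Address(first))
    high = int(ipaddress.IPv4Address(last))
    if high < low:
        raise ValueError("last address is below first address")
    return high - low + 1


if __name__ == "__main__":
    # The suggested pool: 172.16.8.1 through 172.16.31.255.
    print(range_size("172.16.8.1", "172.16.31.255"))  # 6143
    # For comparison, the whole 172.16.0.0/14 network.
    print(ipaddress.ip_network("172.16.0.0/14").num_addresses)  # 262144
```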

I joined this mailing list because I was looking for advice on how to
fix my problem.
The problem could be caused by an incorrect configuration or by a bug in
the server software.
If my config is incorrect, I would be obliged if someone helped me to fix it.

But if it is a bug in the software, I am not able to find the cause.
The source code and the protocol itself are too extensive for me to do
that efficiently.
The problem I've met is repeatable.
If I place my "big" dhcpd.leases file (I have a copy) in the default
location and run the server with my config, I see that after a while (a
few minutes) the server becomes dysfunctional, i.e. new clients won't get
an IP.

The experiments you suggested above may give some more information
about the limitations of the server, but in my opinion that is not the
right way to proceed.
A program's limitations should follow from the programmers' assumptions
rather than from after-the-fact investigation of the "let's check what we
really coded and what is possible" kind.
So if there are known limitations, I would be obliged if someone
pointed them out.

And please allow me not to experiment, because I do not see how the
results could help me with the problem.
The "knowledge" gained this way would in my opinion be very unreliable
(even for me).



