DHCPD Stopping (but not with v 3.1.2 ...)

Dean, Barry B.Dean at liverpool.ac.uk
Fri Mar 20 15:49:54 UTC 2009


Just ran my experiments against version 3.1.2 of DHCPD.

I see the same symptoms, except the server keeps going.

What it looks like is that the client sends a unicast DHCPREQUEST to the server it got its dynamic IP from. The server wants to NAK it because it has a new fixed IP ready for that client. The server can't be sure it can unicast the response, as the IP being asked for could be from a different subnet (it's not in this case, but it could be), so it broadcasts the NAK. That is no use, though, as the broadcast NAK does not reach the client. The difference between versions comes down to:

v 4.0.0
                One unicast REQUEST hits the server, one broadcast NAK is sent – server stops. Chaos ensues. Boss comes in to office!

v 3.1.2
                8 unicast REQUESTs hit the server, 8 broadcast NAKs are sent – server still responds.

Then (with v 3.1.2 only, as v 4.0.0 has stopped by now!) the client sends a broadcast DHCPREQUEST for the dynamic IP. This is picked up by the router and relayed, so the server now has a relay to direct the NAK at. The NAK reaches the client, which immediately drops that IP, goes for a DISCOVER and gets sent its fixed address.
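
The addressing behaviour described above matches the reply rules in RFC 2131, section 4.1, as far as I can tell. Here is a minimal Python sketch of those rules. It is not ISC dhcpd's actual code, and the `Request` class, the `reply_destination` function and the example addresses (a client at 192.168.22.50, a relay at 192.168.22.1) are made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address

ZERO = IPv4Address("0.0.0.0")
LIMITED_BROADCAST = IPv4Address("255.255.255.255")


@dataclass
class Request:
    """The three DHCPREQUEST fields that decide where the reply is sent."""
    giaddr: IPv4Address           # relay agent address; zero if no relay touched the packet
    ciaddr: IPv4Address           # client's current address; set when renewing
    broadcast_flag: bool = False  # client asked for a broadcast reply


def reply_destination(req, reply_type, yiaddr=ZERO):
    """Return (destination IP, UDP port) for a server reply, after RFC 2131 4.1."""
    if req.giaddr != ZERO:
        # Relayed request: the reply goes to the relay agent on the server port.
        return (req.giaddr, 67)
    if reply_type == "DHCPNAK":
        # No relay address: a NAK is always broadcast on the server's own wire.
        return (LIMITED_BROADCAST, 68)
    if req.ciaddr != ZERO:
        # OFFER/ACK to a client that already has an address: unicast to it.
        return (req.ciaddr, 68)
    if req.broadcast_flag:
        return (LIMITED_BROADCAST, 68)
    # Otherwise unicast to the address being assigned (sent to the client's MAC).
    return (yiaddr, 68)


# Hypothetical renewal sent straight to the server (no giaddr), then a relayed one.
renewal = Request(giaddr=ZERO, ciaddr=IPv4Address("192.168.22.50"))
relayed = Request(giaddr=IPv4Address("192.168.22.1"), ciaddr=ZERO)
print(reply_destination(renewal, "DHCPNAK"))
print(reply_destination(relayed, "DHCPNAK"))
```

Under these rules the NAK for the unicast renewal goes to the limited broadcast address on the server's own subnet, which a client on another subnet never sees, while the NAK for the relayed REQUEST goes to the relay and can be passed on to the client.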

Can we have v 4.* behave like v 3.1.2, please? Then the boss will not keep coming in to my office asking "have you fixed it yet?" :)

I don’t want to have to “downgrade” to v 3.1.2, but may have to. If I do, anyone got any horror stories of the leases file getting wrecked? Seemed to work on my test server (7 leases), but nervous about letting it loose on my production server (25,000 “lease” entries on 2 pairs of servers when I just checked!).

---------------
Barry Dean
Networks Team
http://pcwww.liv.ac.uk/~bvd/

From: dhcp-users-bounces at lists.isc.org [mailto:dhcp-users-bounces at lists.isc.org] On Behalf Of Dean, Barry
Sent: 20 March 2009 13:02
To: Users of ISC DHCP
Subject: RE: DHCPD Stopping

Thanks Jane, this is very interesting. It helps. We were running an ancient release candidate version of 3 point something on NetBSD 3.1.

I figured out why the server's NAK was broadcast: the REQUEST had come in from the interface with no "giaddr" set. That is to say, the REQUEST was logged as:

DHCPREQUEST on <client IP> from <client MAC> via e1000g0: lease <client IP> unavailable.

I fail to see how this could have happened, as the client is on the 192.168.22.0/24 subnet and the server is on 192.168.5.32/29. The router must have relayed the packet, so why is giaddr missing from the incoming REQUEST? Did the router forget to put it in there, or did the server lose it? :-)

On a test Linux PC, I killed off the automatic network manager and ran "dhclient" manually, repeated the experiment and did not get the problem. That time the REQUEST was logged as:

DHCPREQUEST on <client IP> from <client MAC> via <router IP>: lease <client IP> unavailable.

In this case the NAK was seen by the client and it dropped the dynamic IP and re-DHCPed and picked up the fixed one with no problems. This was using the ISC “dhclient”.
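
The two log shapes above differ only in the "via" field: an interface name when the packet arrived with no giaddr, and a router IP when it was relayed. Here is a rough Python sketch for picking the two cases out of a log. The regular expression is my own guess at a pattern loose enough for the lines quoted in this thread, and the sample addresses are invented.

```python
import re
from ipaddress import ip_address

# Message type, then whatever token follows "via" (minus any trailing colon).
VIA_RE = re.compile(r"\b(DHCP[A-Z]+)\b.*?\bvia\s+([^\s:]+)")


def classify(line):
    """Return (message type, via token, 'relayed' or 'direct'), or None if no match."""
    match = VIA_RE.search(line)
    if match is None:
        return None
    msg_type, via = match.group(1), match.group(2)
    try:
        ip_address(via)    # parses as an address: the packet came through a relay
        path = "relayed"
    except ValueError:     # not an address, so presumably an interface name
        path = "direct"
    return (msg_type, via, path)


# Invented sample lines in the same shape as the ones quoted in this thread.
samples = [
    "DHCPREQUEST on 192.168.22.50 from 00:11:22:33:44:55 via e1000g0: lease 192.168.22.50 unavailable.",
    "DHCPNAK on 192.168.22.50 to 00:11:22:33:44:55 via 192.168.22.1",
]
for sample in samples:
    print(classify(sample))
```

Filtering for DHCPNAK lines classified as "direct" should list the clients whose NAK was broadcast on the server's own subnet.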

I think I will repeat the experiments and try v 3.1.0 as well. This problem seems to be in 4.0.0 and 4.1.0, so I assume it is in 4.0.1 as well.

---------------
Barry Dean
Networks Team
http://pcwww.liv.ac.uk/~bvd/

From: dhcp-users-bounces at lists.isc.org [mailto:dhcp-users-bounces at lists.isc.org] On Behalf Of Jane Zuzek
Sent: 20 March 2009 12:37
To: Users of ISC DHCP
Subject: Re: DHCPD Stopping

I'm sorry I can't offer any solutions, but I'll note that we experienced similar problems last month when we attempted to upgrade from dhcp 3.1.0 to version 4.1.0 running on Solaris 10.  We ended up reverting to version 3.1.0 after about 2 days, so I didn't gather a lot of data.  We're running in a single server environment, with no failover configured.

We experienced two notable issues:

1) Assigning a static address to a formerly-dynamic client caused DHCP services to stop responding, as detailed by Barry.
2) Many statically-defined clients on the same network as the DHCP server could cause services to hang simply by sending a DHCPREQUEST (lease renewal). We found that these were frequently Macintosh systems, although, as I said, we didn't stay at version 4.1.0 long enough to gather a lot of data.

When we encountered our problems, the dhcpd daemon would continue to run, but simply didn't seem to be responding to requests.   At one point, when the daemon was failing to respond to requests, I noted in the logs that it rotated out the dhcpd.leases file as part of its normal hourly operation.  So it was still performing some functions.

Although I was able to write a script, run from cron, to watch for DHCP services becoming unresponsive (i.e., watch for a lack of updates to the dhcpd.log file) and restart services automatically, this didn't resolve the underlying problem with formerly-dynamic clients picking up newly assigned static addresses. So we reverted to version 3.1.0.
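
A watchdog of the kind described above might look roughly like the Python sketch below. It is not the script from this message. The function names are made up, and the log path, age threshold and restart command are left as parameters because the thread does not say what was actually used.

```python
import os
import subprocess
import time


def log_is_stale(log_path, max_age_seconds, now=None):
    """True if the log file has not been modified within max_age_seconds."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(log_path) > max_age_seconds


def check_and_restart(log_path, max_age_seconds, restart_cmd):
    """Run restart_cmd (a list of arguments) if the log looks stale.

    Returns True if a restart was attempted, False otherwise.
    """
    if not log_is_stale(log_path, max_age_seconds):
        return False
    # restart_cmd is whatever restarts dhcpd on the host; supplied by the caller.
    subprocess.run(restart_cmd, check=True)
    return True
```

Run from cron every few minutes, `check_and_restart` would be called with the site's own log path, a threshold longer than the quietest normal gap between log entries, and the local restart command. As noted above, this only masks the hang; it does not fix the cause.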

   Jane Zuzek

Dean, Barry wrote:

I am still having the occasional problem with DHCP ceasing to answer requests from clients.

I have traced it to a reproducible situation.

Client (only seen with Linux and HP Printers at the moment) is sitting on a dynamically allocated IP from the pool and all is happy.

We allocate a fixed IP to the MAC address by editing a "host {}" entry into the config and restarting the master and slave servers in the failover pair.

Client does a DHCPREQUEST for the dynamic IP, the server issues a DHCPNAK.

The client and server are on separate subnets with Cisco 6905E routers doing the dhcp relaying.

The interesting thing is that I see lots of DHCPNAKs, and most do not cause a problem. Looking in the logs, the "safe" NAKs look like:

DHCPNAK on <client IP> to <client MAC> via <router IP>

But the server stops serving clients as soon as it issues a NAK that is logged as:

DHCPNAK on <client IP> to <client MAC> via e1000g0

e1000g0 being the interface on the DHCP server (a Sun X4200).

This looks like DHCPD has broadcast the NAK instead of unicasting it to the router for relaying. The upshot is that the client does not see the NAK and keeps on using the dynamic IP it has.

This is causing us a bit of trouble at the moment as we have printers and PCs being moved all over the shop and we are losing the DHCP service several times a day.

What makes the server broadcast the NAK on its own local subnet? How can I stop it? It looks like a client issue, but the server's way of dealing with it is a cause for concern...

As the leases on our dynamic pools are 8 days, it can be a while after you issue the fixed address that the server dies. If it happens over a weekend we lose services. Our wireless access points seem particularly sensitive and we have lost wireless everywhere on one occasion.

Any ideas? I am getting hassle over this! Any help much appreciated.

Thanks.

---------------
Barry Dean
Networks Team
Computing Services Department
Tel: 0151 794 5641 (x45641), Web: http://pcwww.liv.ac.uk/~bvd/
---
Nice boy, but about as sharp as a sack of wet mice.
                -- Foghorn Leghorn




