Spontaneously shortened leases
gregs at sloop.net
Fri Jan 9 22:08:27 UTC 2015
This sounds like the servers went into communications-interrupted mode. IIRC, in CI mode, a server will give out leases based on MCLT (max-client-lead-time). [I see someone else has already posted this.] Yet, it seems your logs don't show this is what happened.
Interestingly enough, I had a similar situation in the last week. One of our peers simply went AWOL. Could ping it, but couldn't ssh into it.
The symptoms were that some, but not all, clients were not getting IP addresses. Clearly it was a problem with the dhcp server.
Restarted the affected server, since I wasn't on site and couldn't attempt to look at the console and we needed it working again ASAP.
Logs were pfft though. From many hours before until the restart, there were no dhcpd logs to review.
On the server that stayed up, it thought the peer [the "down" box] was still up, and so it wouldn't lease any of the leases from the "down" server. [Because it thought it was still "communications-normal".]
Once the "down" server was restarted, they all of a sudden realized things were wrong, re-balanced and went back to communications-normal.
Then the "down" server re-leased all its tied-up leases and all went back to normal.
[In the instances above, where I put "down" - I mean functionally down. Neither of the boxes appeared to actually recognize that the down peer was actually not working properly.]
I've not had time to go back and reexamine the logs to see if I find anything new/interesting - but I was rather surprised and baffled.
Your situation is, perhaps, somewhat the inverse of mine - but thought this anecdote might be helpful.
BTW: We're on 2.4.2 on Ubuntu 14.04, IIRC.
We are running a pair of 4.2.7 servers in failover mode. On Christmas, we had an issue with the clients on the network they support and eventually discovered that the servers were handing out leases with 60-second lease times. As a result, the servers were swamped with DHCP requests, and the clients were unable to consistently get and keep an address because the servers were dropping tens of thousands of packets a second as every device tried to renew.
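For anyone wanting to confirm short leases from the lease database rather than from client behaviour, here is a rough sketch that scans dhcpd.leases text for entries whose "ends" minus "starts" is under a threshold. It assumes the usual "lease <ip> { starts ...; ends ...; }" layout; the function name, the threshold and the sample data are made up for illustration and are not taken from the servers discussed here.

```python
import re
from datetime import datetime

# Matches one "lease <ip> { ... }" entry; assumes no nested braces in the body.
LEASE_RE = re.compile(r"^lease\s+(\S+)\s*\{(.*?)\n\}", re.S | re.M)
# Matches "starts <weekday> YYYY/MM/DD HH:MM:SS;" and the same for "ends".
TIME_RE = re.compile(
    r"\b(starts|ends)\s+\d+\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});"
)


def short_leases(text, threshold_seconds=300):
    """Return (ip, duration_seconds) for entries shorter than the threshold."""
    found = []
    for ip, body in LEASE_RE.findall(text):
        times = {
            key: datetime.strptime(value, "%Y/%m/%d %H:%M:%S")
            for key, value in TIME_RE.findall(body)
        }
        # Entries such as "ends never;" have no parsable end time; skip them.
        if "starts" in times and "ends" in times:
            duration = int((times["ends"] - times["starts"]).total_seconds())
            if duration < threshold_seconds:
                found.append((ip, duration))
    return found


# Hypothetical sample data, for illustration only.
SAMPLE = """\
lease 10.0.0.5 {
  starts 4 2014/12/25 10:00:00;
  ends 4 2014/12/25 10:01:00;
  binding state active;
}
lease 10.0.0.6 {
  starts 4 2014/12/25 10:00:00;
  ends 5 2014/12/26 10:00:00;
  binding state active;
}
"""

if __name__ == "__main__":
    for ip, seconds in short_leases(SAMPLE):
        print(ip, seconds)
```

Since the lease file is appended to over time, the same address can show up more than once; this sketch reports every matching entry rather than only the latest one.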
It was noted that the normal "balancing" entries (for failover) in the dhcp logs were not present during this time.
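Alongside the balancing entries, it may be worth checking whether either server logged a failover state change. The sketch below pulls such transitions out of syslog lines. The message shape in the regular expression ("failover peer NAME: I move from X to Y" and "peer moves from X to Y") is an assumption about what dhcpd writes, and the sample lines are invented, so check both against your own logs before relying on it.

```python
import re

# Assumed message shape; adjust to match what your syslog actually contains.
STATE_RE = re.compile(
    r"failover peer (\S+): (I move|peer moves) from (\S+) to (\S+)"
)


def failover_transitions(lines):
    """Return (peer_name, side, old_state, new_state) for each state change."""
    transitions = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            name, who, old, new = match.groups()
            side = "local" if who == "I move" else "peer"
            transitions.append((name, side, old, new))
    return transitions


# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    "Dec 25 03:10:01 dhcp1 dhcpd: failover peer dhcp-failover: "
    "I move from normal to communications-interrupted",
    "Dec 25 03:10:02 dhcp1 dhcpd: DHCPREQUEST for 10.0.0.5 "
    "from 00:11:22:33:44:55 via eth0",
    "Dec 25 09:45:30 dhcp1 dhcpd: failover peer dhcp-failover: "
    "peer moves from communications-interrupted to normal",
]

if __name__ == "__main__":
    for entry in failover_transitions(SAMPLE_LOG):
        print(entry)
```

An empty result over the affected period would be consistent with neither server having noticed a problem with its peer, though it would not prove it.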
The timestamps on the configuration files where the leases are configured were several months old and were still correctly configured for 24-hour leases.
We tried stopping DHCP on each of the servers for a time to see if the remaining server would take over, without success. Ultimately, we had to restart one of the servers itself, not just its DHCP process, to recover.
Once that was done and the downed server came online, the balancing entries were visible in the logs once again and the leases were for 24 hours instead of 60 seconds. Has anyone else seen this problem? We have had no success determining what started this and would prefer to avoid it in the future, as it took ten days to get all of the clients straightened back out.