[Kea-users] Help diagnosing (and potentially addressing) a possible performance problem?
regj at arch-ed.dk
Thu Oct 5 09:02:28 UTC 2017
I have seen something very similar on VMware with another application
receiving a lot of UDP traffic. Unfortunately we never found a solution
for it and switched to bare metal as a workaround, which has irked me
ever since, and I'm interested in finding a root cause for these kinds
As far as I understand, and according to the netstat man page, Recv-Q is
the count of bytes not yet copied by the user program connected to the
socket.
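
To watch that backlog without going through netstat, the same counter can be read from /proc/net/udp. Below is a minimal sketch, assuming a Linux host. The function name, the SAMPLE text and its values are made up for illustration; only the column layout (hex local address and port, hex tx_queue:rx_queue, and a trailing decimal drops column on reasonably recent kernels) is taken from the kernel's format. The rx_queue figure is the kernel's receive-buffer accounting for the socket, so treat it as an approximation of unread bytes.

```python
import os

def parse_proc_net_udp(text):
    """Return one dict per UDP socket: local port, rx_queue, drops."""
    sockets = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue
        # local_address looks like "00000000:0043" (hex IP : hex port)
        port = int(fields[1].rsplit(":", 1)[1], 16)
        # queue field looks like "tx_queue:rx_queue", both hex
        _tx_hex, rx_hex = fields[4].split(":")
        # older kernels may not have the trailing drops column
        drops = int(fields[12]) if len(fields) > 12 else None
        sockets.append({"port": port, "rx_queue": int(rx_hex, 16), "drops": drops})
    return sockets

# Hypothetical sample, not taken from the system discussed in this thread.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when "
    "retrnsmt   uid  timeout inode ref pointer drops\n"
    "  420: 00000000:0043 00000000:0000 07 00000000:0007A120 00:00000000 "
    "00000000     0        0 12345 2 0000000000000000 565\n"
)

if __name__ == "__main__":
    path = "/proc/net/udp"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    else:
        text = SAMPLE
    for sock in parse_proc_net_udp(text):
        if sock["port"] == 67:  # DHCPv4 server port
            print(sock)
```

If the drops value for the port 67 socket keeps climbing while rx_queue sits near its ceiling, that would suggest the socket buffer is overflowing because the daemon is not reading fast enough, rather than packets being lost earlier in the path.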
Do you have special rules, execute something, or do DNS lookups when
handling DHCP requests?
Have you read the comments on ALLOC_ENGINE_V4_ALLOC_FAIL?
"% ALLOC_ENGINE_V4_ALLOC_FAIL %1: failed to allocate an IPv4 address
after %2 attempt(s)
The DHCP allocation engine gave up trying to allocate an IPv4 address
after the specified number of attempts. This probably means that the
address pool from which the allocation is being attempted is either
empty, or very nearly empty. As a result, the client will have been
refused a lease. The first argument includes the client identification
This message may indicate that your address pool is too small for the
number of clients you are trying to service and should be expanded.
Alternatively, if you know that the number of concurrently active
clients is less than the addresses you have available, you may want to
consider reducing the lease lifetime. In this way, addresses allocated
to clients that are no longer active on the network will become
Klaus Steden wrote on 2017-10-05 03:03:
> Hi everyone,
> We've been using Kea successfully for several months now as a key part of our provisioning process. However, it seems like the server we're running it on (a VM running under XenServer 6.5) isn't beefy enough, but I'm not 100% confident in that diagnosis.
> There are currently ~200 unique subnets defined, about two-thirds of which are used to provide a single lease during provisioning, at which point the host in question assigns itself a static IP. There are 77 subnets that are actively in use (for IPMI), with the following lease attributes:
> "valid-lifetime": 4000,
> "renew-timer": 1000,
> "rebind-timer": 2000,
> From what I'm seeing in the output of tcpdump, there are a LOT more requests coming in than replies going out, and _netstat_ seems to confirm that:
> # netstat -us
> 71774 packets received
> 100 packets to unknown port received.
> 565 packet receive errors
> 4911 packets sent
> If I monitor _netstat_ continuously, I see spikes on the RecvQ for Kea that fluctuate wildly, anywhere between 0 and nearly 500K (and sometimes higher) moment to moment.
> The log also reports a lot of ALLOC_ENGINE_V4_ALLOC_FAIL errors after typically 53 attempts (not sure why 53, but that number seems to be the typical upper limit before failure is confirmed).
> I've been experimenting over the last hour or so with tuning various kernel parameters (net.ipv4.udp_mem, net.core.rmem_default, net.core.netdev_max_backlog, etc.) but those don't appear to make any kind of difference, and the RecvQ remains high.
> Is there any way I can tune the daemon to handle this kind of backlog, or is there a list of kernel tunables I should be looking at modifying? Is there a clearer way to determine whether I've hit a genuine performance limitation that we're only now running into?
> I've got a bare metal machine temporarily helping carry the burden and it doesn't have these issues, but then again, it's not carrying the full load; I'm loath to dedicate a whole physical server just to DHCP, but if the load is going to remain high like this, maybe that's just what I have to do.
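
To put the quoted netstat -us counters in proportion, here is a small sketch that computes the sent-to-received and error-to-received ratios from the three figures in the message above. The function name is hypothetical. Bear in mind that these counters are system-wide for all UDP traffic and cumulative since boot, so this is only a rough indicator and not a per-daemon measurement.

```python
def udp_ratios(received, sent, receive_errors):
    """Return sent and receive-error counts as fractions of packets received."""
    if received <= 0:
        raise ValueError("received must be positive")
    return {
        "sent_per_received": sent / received,
        "errors_per_received": receive_errors / received,
    }

# Figures copied from the netstat -us output quoted above.
ratios = udp_ratios(received=71774, sent=4911, receive_errors=565)
print(f"sent/received:   {ratios['sent_per_received']:.1%}")    # 6.8%
print(f"errors/received: {ratios['errors_per_received']:.1%}")  # 0.8%
```

Taking two snapshots a fixed interval apart and feeding the differences into the same function would give a picture of the current rate instead of the since-boot totals.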