[kea-dev] Good news and bad news

Thomas Markwalder tmark at isc.org
Fri Nov 30 11:26:39 UTC 2018


We have a fundamental flaw in our "non-queue", default receiver logic. 
One that has been there forever.  We never noticed it because we never 
test with traffic on more than one interface.  Shame on us.  Here's the 
code at the heart of the issue:

In the regular, main-thread mode, we call IfaceMgr::receive4(), which 
reads DHCP socket data with this block of code:


     // Let's find out which interface/socket has the data
     BOOST_FOREACH(iface, ifaces_) {
         BOOST_FOREACH(SocketInfo s, iface->getSockets()) {
             if (FD_ISSET(s.sockfd_, &sockets)) {
                 candidate.reset(new SocketInfo(s));
                 break;
             }
         }
         if (candidate) {
             break;
         }
     }

This reads and returns the first packet on the first ready interface.  
Now this works fine with one interface.  When there is more than one and 
they are all equally busy, the first ready socket we come to is the one 
that gets serviced.  Because we always loop through them in the same 
order, if that interface is really busy it gets all the attention.  The 
rest starve.  To demonstrate this I ran two instances of perfdhcp against 
kea-dhcp4 with MySQL, and without packet queuing, configured with two 
subnets, 175.0.0.0/8 and 178.0.0.0/8:

First interface declared in the config:
----------------------------------------
Running: perfdhcp -4 -r 500 -R 500000 -p 5 175.16.1.10
***Rate statistics***
Rate: 113.388 4-way exchanges/second, expected rate: 500

***Statistics for: DISCOVER-OFFER***
sent packets: 2161
received packets: 658
drops: 1503


Second interface declared in the config:
----------------------------------------
Running: perfdhcp -4 -r 500 -R 500000 -p 5 178.16.1.10
***Rate statistics***
Rate: 0.199951 4-way exchanges/second, expected rate: 500  <------- STARVED!!!!

***Statistics for: DISCOVER-OFFER***
sent packets: 2211
received packets: 1
drops: 2210


(I used a simple shell script and nohup to start both perfdhcp instances 
at the same time.)  What this means is that sites running Kea now with 
multiple sockets (interfaces or subnets) are probably having issues 
during high-traffic conditions.
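
To make the starvation mechanism concrete, here is a stripped-down 
sketch of the same pattern outside of Kea (pickSocketFixedOrder() and 
the fds vector are invented for illustration; this is not our code): 
scan the sockets in a fixed order and stop at the first ready one.  If 
the first socket is always readable, the later ones never get picked.

    #include <sys/select.h>
    #include <vector>

    // Illustration only: return the fd that would be serviced on this
    // pass, or -1 if nothing is ready.  Assumes fds holds already-bound
    // sockets, listed in the order the interfaces appear in the config.
    int pickSocketFixedOrder(const std::vector<int>& fds) {
        fd_set readable;
        FD_ZERO(&readable);
        int max_fd = -1;
        for (int fd : fds) {
            FD_SET(fd, &readable);
            if (fd > max_fd) {
                max_fd = fd;
            }
        }

        struct timeval timeout = { 1, 0 };  // one second
        if (select(max_fd + 1, &readable, nullptr, nullptr, &timeout) <= 0) {
            return (-1);
        }

        // Same shape as receive4(): fixed order, stop at the first hit.
        // If fds[0] is always ready, fds[1] and friends starve.
        for (int fd : fds) {
            if (FD_ISSET(fd, &readable)) {
                return (fd);
            }
        }
        return (-1);
    }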


The good news is that the packet-queue receive logic is structured a bit 
differently. Looking at IfaceMgr::receiveDHCP4Packets(), which is used 
by the receiver thread:


         // Let's find out which interface/socket has data.
         BOOST_FOREACH(iface, ifaces_) {
             BOOST_FOREACH(SocketInfo s, iface->getSockets()) {
                 if (FD_ISSET(s.sockfd_, &sockets)) {
                     receiveDHCP4Packet(*iface, s);
                     // Can take time so check one more time the watch socket.
                     if (dhcp_receiver_->shouldTerminate()) {
                         return;
                     }
                 }
             }
         }

The function receiveDHCP4Packet() pushes the packet onto the queue, but 
rather than breaking on the first ready socket, the loop continues reading 
from ALL ready interfaces.  Running the same dual perfdhcp test shows this:


First interface declared in the config:
----------------------------------------
Running: perfdhcp -4 -r 500 -R 500000 -p 5 175.16.1.10
***Rate statistics***
Rate: 51.3835 4-way exchanges/second, expected rate: 500

***Statistics for: DISCOVER-OFFER***
sent packets: 2172
received packets: 876
drops: 1296

Second interface declared in the config:
----------------------------------------
Running: perfdhcp -4 -r 500 -R 500000 -p 5 178.16.1.10
***Rate statistics***
Rate: 54.7949 4-way exchanges/second, expected rate: 500

***Statistics for: DISCOVER-OFFER***
sent packets: 2214
received packets: 838
drops: 1376


Notice the combined rate is approximately 100 leases per second (LPS), 
which matches the single-thread performance for the serviced interface.  
In other words, my test setup can serve about 100 LPS, regardless of how 
many interfaces are involved.  With queuing we at least service all 
interfaces.
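
For comparison, here is the service-everything shape in the same toy 
terms (again, serviceAllReady() and handlePacket are invented names; 
this is not the Kea code): after select() returns, every ready socket 
gets read before we wait again, so a busy first interface can slow the 
others down but cannot shut them out completely.

    #include <sys/select.h>
    #include <functional>
    #include <vector>

    // Illustration only: read from every ready socket on each pass,
    // mirroring the shape of receiveDHCP4Packets() rather than receive4().
    void serviceAllReady(const std::vector<int>& fds,
                         const std::function<void(int)>& handlePacket) {
        fd_set readable;
        FD_ZERO(&readable);
        int max_fd = -1;
        for (int fd : fds) {
            FD_SET(fd, &readable);
            if (fd > max_fd) {
                max_fd = fd;
            }
        }

        struct timeval timeout = { 1, 0 };  // one second
        if (select(max_fd + 1, &readable, nullptr, nullptr, &timeout) <= 0) {
            return;
        }

        // No early break: every socket that select() flagged is serviced.
        for (int fd : fds) {
            if (FD_ISSET(fd, &readable)) {
                handlePacket(fd);
            }
        }
    }

The point of the sketch is only the absence of the early break; the real 
thread code of course pushes onto the queue and re-checks the terminate 
flag, as shown above.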

One of the things we really need to add to our testing is multiple 
interface/socket scenarios.


Thomas

