request drops in BIND?

KyoungSoo Park kyoungso at cs.princeton.edu
Fri May 14 01:01:15 UTC 2004


Hi,

First of all, I don't think BIND intentionally drops my requests as 
Barry says.
If it does, then it should leave that in the trace, but I couldn't find any.
(And there is no reason to do so as well.)

Rick Jones wrote:

>KyoungSoo Park <kyoungso at cs.princeton.edu> wrote:
>  
>
>>BIND's default recv buffer size is max(32KB, the system default), and
>>on the box I tested, it's 64KB.  Each request is about 64 bytes
>>including the UDP/IP headers, so the recv buffer should be big enough
>>to hold all the requests.
>>    
>>
>
>That "depends" - the driver will, likely as not, have allocated a full
>link-MTU-sized buffer for the inbound DMA.  There is no a priori
>knowledge of an inbound packet's size, so most simple card/driver
>models post an MTU-sized buffer for DMA (yes, there can be more
>sophisticated schemes :)
>
>So, over Ethernet that could be a 1500ish byte buffer - and if the
>system is using a power-of-two allocator, that would be a 2048 byte
>buffer.
>
>All that to hold your 64-byte request.
>
>Now... if the stack tracks not just actual data bytes but also
>"overhead bytes", your 200 64-byte requests could be counted as
>O(400KB) (200 * 2048) of space, and could indeed be overflowing the
>socket buffer if they arrive in a spike faster than named can
>process them.
>  
>
Well, that seems rather unconvincing. Even if the driver does such 
sloppy allocation (really?),
why does the socket buffer have to be charged for the garbage? How does 
the kernel tell the actual
data from the garbage in a given UDP buffer when we call recvfrom()?
I think that even if the driver over-allocates, the UDP socket should 
see only the actual data.
(Any garbage would be screened out when the data is copied.)
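
For what it's worth, here is the accounting I understand Rick to be 
describing, as a rough sketch of my own (illustrative pseudo-kernel 
code; the struct and names are made up, not any real stack's internals):

#include <stddef.h>
#include <stdio.h>

/* Illustrative only: a stack that charges the socket for the whole
 * allocated buffer ("truesize"), not just the data bytes it carries. */
struct fake_sock {
    size_t rmem_alloc;    /* bytes currently charged to this socket */
    size_t rcvbuf;        /* limit, e.g. 64KB from SO_RCVBUF        */
};

/* Returns 0 if the datagram is queued, -1 if it is dropped. */
int queue_datagram(struct fake_sock *sk, size_t truesize)
{
    if (sk->rmem_alloc + truesize > sk->rcvbuf)
        return -1;                  /* drop: buffer considered full   */
    sk->rmem_alloc += truesize;     /* charged 2048 for 64 real bytes */
    return 0;
}

int main(void)
{
    struct fake_sock sk = { 0, 64 * 1024 };  /* 64KB, as on my box */
    int queued = 0, dropped = 0, i;

    /* A spike of 200 requests, each held in a 2048-byte rounded
     * buffer, assuming named dequeues nothing in the meantime. */
    for (i = 0; i < 200; i++) {
        if (queue_datagram(&sk, 2048) == 0)
            queued++;
        else
            dropped++;
    }
    printf("queued %d, dropped %d\n", queued, dropped); /* 32 and 168 */
    return 0;
}

Under that scheme, recvfrom() would still return only the 64 real 
bytes, so the application never sees garbage; the question is whether 
the socket is charged 2048 or 64 per datagram while it sits in the 
queue. If it's 2048, a 64KB buffer fills after only 32 queries, but I'm 
still not convinced our stack actually does this.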

Also, if you look carefully at the BIND source code (ns_main.c), you'll 
find a place where it drops
queries when it thinks it has spent too much time (>3 sec) handling 
some "needs". At that point,
it handles up to 500 queries anyway and then starts dropping the rest. 
I think 500 was chosen
because the default receive buffer is 32KB, and 64 bytes * 500 is about 
32 KB (although 64 * 512 = 32 KB, to be exact).
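
As I read it, the logic is roughly the following (my own paraphrase in 
C, not the literal ns_main.c source; the names, constants, and 
handle_query() are illustrative):

#include <time.h>
#include <sys/socket.h>

#define CATCHUP_LIMIT 500    /* queries to handle before giving up */
#define STALL_SECONDS 3      /* "too long" spent on other needs    */

extern void handle_query(const char *query);  /* hypothetical handler */

void drain_queries(int sock, time_t busy_since)
{
    char query[512];
    int handled = 0;

    if (time(NULL) - busy_since <= STALL_SECONDS)
        return;                          /* no catch-up needed */

    /* Drain the backlog: answer the first 500, discard the rest.
     * (MSG_DONTWAIT: non-blocking read, Linux-style.) */
    while (recvfrom(sock, query, sizeof(query), MSG_DONTWAIT,
                    NULL, NULL) > 0) {
        if (handled++ < CATCHUP_LIMIT)
            handle_query(query);
        /* else: the datagram has been read off the socket and is
         * silently thrown away */
    }
}

So after a long stall, anything beyond the first 500 queued queries is 
simply discarded.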

>Some stacks have statistics for UDP socket buffer overflows - you
>might look at the netstat output (the exact options vary by system).
>That would also be a place to look for drops of the responses by the
>client program.
>
Thanks. That would be really helpful. I'll look for that.
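(On Linux, at least, I believe "netstat -su" prints a Udp: section 
whose "packet receive errors" counter covers datagrams dropped for lack 
of socket buffer space; that should show directly whether the receive 
buffer is overflowing.)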

>
>I suppose if you wanted to eliminate socket buffer overflow as a
>possibility, you might make the default on your system ridiculously
>large - say 1 GB or something.  Then run your test again.
>  
>
But I don't think setsockopt() would even succeed with a size like that.
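
It is easy enough to check what the kernel will actually grant, though. 
A minimal sketch (assuming a BSD-sockets system; on Linux, setsockopt() 
tends to succeed but silently clamp the value to net.core.rmem_max, so 
reading it back is the only reliable check):

#include <stdio.h>
#include <sys/socket.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int asked = 1 << 30;                /* ~1 GB, per Rick's suggestion */
    int granted = 0;
    socklen_t len = sizeof(granted);

    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &asked, sizeof(asked)) < 0)
        perror("setsockopt SO_RCVBUF"); /* an outright refusal */

    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &granted, &len);
    printf("asked for %d bytes, kernel granted %d\n", asked, granted);
    return 0;
}

If the granted value comes back pinned at some system cap, that cap is 
what would have to be raised before the 1 GB test means anything.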

>Drifting a bit since you asserted that tcpdump showed all the requests
>arriving at the server... but for the sake of completeness...
>
>...there can be limits on the outbound transmit queues for the NIC.
>Ostensibly, there should be a way to retrieve link-level statistics
>from your systems to see if perhaps the transmit queues have
>overflowed.
>  
>
Our switch is 100 Mbps, and the client and the server were on the same 
segment.
The data rate I measured from the tcpdump timestamps is just over 2 
Mbps, about 2% of the link capacity.
Given that PING had no problems, I don't think there was any overflow 
at the switch.

-----------------------------------------------------------------------------------------------------------------
But here is an important issue: if a burst of 178 requests sometimes 
gets no response at all from BIND,
can we really say BIND is doing well?
Bursts of 178 requests must happen every now and then in a reasonably 
sized community,
and that makes it problematic to run an application whose very 
sensitive jobs depend on BIND's performance.

Thanks,
KyoungSoo
