TCP backlog exceeded with bind 9.16.0

Craig Leres bind-workers-post at ee.lbl.gov
Mon Mar 16 17:56:14 UTC 2020


Since 9.14 reaches EOL soon, we recently switched to 9.16. Several times 
since then it has been necessary to restart the zone master named 
because it starts rejecting TCP connections (breaking zone transfers, 
among other things). When this happens, netstat shows a bunch of 
established connections with no pending data:

     Proto Recv-Q Send-Q Local Address          Foreign Address         (state)
     tcp4       0      0 131.243.???.???.53     ???.???.???.???.2223    ESTABLISHED

There are also some connections with non-zero receive queues in the 
CLOSE_WAIT or CLOSED state:

     tcp4      66      0 131.243.???.???.53     ???.???.???.???.37998   CLOSED
     tcp4      66      0 131.243.???.???.53     ???.???.???.???.29349   CLOSE_WAIT

Finally, the TCP listen queue for named's public IP has more connections 
queued than the TCP backlog allows (16 queued against a maximum of 10):

     Current listen queue sizes (qlen/incqlen/maxqlen)
     Proto Listen                           Local Address
     tcp4  16/0/10                          131.243.???.???.53

This is why new TCP connections are being refused.
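
For anyone who wants to watch for the same condition, here is a rough 
sketch of a check that reads listen queue output in the 
qlen/incqlen/maxqlen layout shown above and reports any listener whose 
qlen is greater than its maxqlen. The function name, the sample 
addresses (192.0.2.1 is a documentation address, not our server), and 
the second sample row are made up for illustration, and the column 
layout is assumed to match the output above:

```python
import re

# Matches a data row such as "tcp4  16/0/10   192.0.2.1.53".
# Header lines do not match because their second field is not N/N/N.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)\s*$")


def overflowing_listeners(netstat_output):
    """Return (proto, local_address, qlen, maxqlen) for rows where qlen > maxqlen."""
    hits = []
    for line in netstat_output.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header or unrelated line
        proto, qlen, _incqlen, maxqlen, local = m.groups()
        if int(qlen) > int(maxqlen):
            hits.append((proto, local, int(qlen), int(maxqlen)))
    return hits


# Hypothetical sample input; the addresses and the second row are invented.
sample = """\
Current listen queue sizes (qlen/incqlen/maxqlen)
Proto Listen                           Local Address
tcp4  16/0/10                          192.0.2.1.53
tcp4  0/0/128                          192.0.2.1.22
"""

for proto, local, qlen, maxqlen in overflowing_listeners(sample):
    print(f"{proto} {local}: {qlen} queued, backlog {maxqlen}")
```

In practice the sample string would be replaced by the captured output 
of the listen queue command, for example from a cron job that alerts 
before named has to be restarted.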

Is anybody else seeing this?

		Craig

