BIND 10 #657: listen_on for b10-auth leads to 100% CPU usage

BIND 10 Development do-not-reply at isc.org
Fri Mar 4 16:39:28 UTC 2011


#657: listen_on for b10-auth leads to 100% CPU usage
-------------------------------------+-------------------------------------
                 Reporter:  jinmei   |                Owner:  UnAssigned
                     Type:  defect   |               Status:  reviewing
                 Priority:  blocker  |            Milestone:  A-Team-Sprint-20110309
                Component:  b10-auth |           Resolution:
                 Keywords:           |            Sensitive:  0
Estimated Number of Hours:  0.0      |  Add Hours to Ticket:  0
                Billable?:  1        |          Total Hours:  0
                Internal?:  0        |
-------------------------------------+-------------------------------------
Changes (by vorner):

 * owner:  vorner => UnAssigned
 * status:  accepted => reviewing


Comment:

 Hello

 The problem was caused by a loop in the TCP and UDP servers which retried
 accept/recv on error. When the original socket was closed during the
 update, its accept failed with an error and the loop kept retrying on the
 closed socket, which resulted in a busy loop.

 I added a condition so that only some errors, the ones that seemed
 non-fatal to me, lead to a retry. However, I have no idea how to test this
 (even with the bug the server worked, it just kept the CPU busy), and the
 asio documentation says very little about how to recognize individual
 errors (Stephen suggested that the error code probably contains errno,
 which makes sense, but how do we test that too?).
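
 A minimal sketch of the idea of retrying only on errors that look
 transient. This is not the code from the branch: it uses a plain
 synchronous loop and std::error_code instead of the asio coroutines, it
 assumes the error code carries an errno value (as Stephen suggested), and
 the names isTransient, retryWhileTransient and FakeClosedSocket as well
 as the chosen set of retryable errors are hypothetical.

```cpp
#include <cerrno>
#include <cstddef>
#include <functional>
#include <system_error>

// Hypothetical helper: true only for errors that are worth retrying.
// The set below is an assumption, not the list used in the branch.
bool isTransient(const std::error_code& ec) {
    return ec == std::errc::resource_unavailable_try_again ||  // EAGAIN
           ec == std::errc::operation_would_block ||           // EWOULDBLOCK
           ec == std::errc::interrupted ||                     // EINTR
           ec == std::errc::connection_aborted;                // ECONNABORTED
}

// Calls `op` until it succeeds or fails with a non-transient error.
// Returns the number of calls made; `result` receives the last error code.
std::size_t retryWhileTransient(const std::function<std::error_code()>& op,
                                std::error_code& result) {
    std::size_t calls = 0;
    do {
        result = op();
        ++calls;
    } while (result && isTransient(result));
    return calls;
}

// Test double standing in for accept()/recv(): it reports EINTR twice and
// then behaves like a closed socket, reporting EBADF on every later call.
struct FakeClosedSocket {
    int calls = 0;
    std::error_code operator()() {
        ++calls;
        return std::error_code(calls <= 2 ? EINTR : EBADF,
                               std::generic_category());
    }
};
```

 With an unconditional retry, the fake socket above would keep the loop
 spinning forever, which is the busy loop from this ticket. With the
 filter, the loop gives up on the first EBADF. A test double of this kind
 is one possible way to exercise the error handling without a real socket.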

 The branch contains one unrelated change: I noticed a no-op call while
 debugging this, so I removed it.

 Anyway, we might want to move away from coroutines in the code, and maybe
 we should revisit the set of allowed errors some time. But this seems to
 solve the 100% CPU problem and I tested that the server still answers
 queries, so I'm putting it up for review as it is.

 Proposed changelog entry:
 {{{
 [bug]      vorner
 Solved a 100% CPU usage problem after switching addresses in b10-auth
 (and possibly, but unconfirmed, in b10-resolver). It was caused by
 repeated reads/accepts on a closed socket.
 }}}

-- 
Ticket URL: <http://bind10.isc.org/ticket/657#comment:4>
BIND 10 Development <http://bind10.isc.org>