BIND 10 #400: Race condition in zonemgr

BIND 10 Development do-not-reply at isc.org
Fri Nov 5 03:14:18 UTC 2010


#400: Race condition in zonemgr
-------------------------------+--------------------------------------------
      Reporter:  vorner        |        Owner:  UnAssigned
          Type:  defect        |       Status:  reviewing 
      Priority:  major         |    Milestone:            
     Component:  Unclassified  |   Resolution:            
      Keywords:                |    Sensitive:  0         
Estimatedhours:  0.0           |        Hours:  0         
      Billable:  1             |   Totalhours:  0         
      Internal:  0             |  
-------------------------------+--------------------------------------------

Comment(by jinmei):

 Replying to [ticket:400 vorner]:
 > The auto-builder failures on FreeBSD8-i386 might be related to a race
 > condition in zone manager. It is possible for the code inside the while
 > loop not to run at all. I propose this simple patch to try and fix it. I
 > think it is too small to be included in changelog. Do you think it is OK?
 >
 I agree that your proposed change will likely fix the problem, but I'm
 afraid it is difficult to understand the real intent of the new code.
 That is, the code sets start_event only once, on the first iteration of
 the while loop, and only when self._running is True.  Now that we know
 about this bug, we can guess that this is somehow related to a race
 between _running and start_event, but I suspect it will be unclear to
 people who read this code without that background.
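
 For readers without that background, the following is a minimal sketch of
 how such a race could look.  It is only an assumed shape of the problem,
 not the actual zonemgr code: the class GuardedLoopSketch and the function
 demo_race are hypothetical, and the "shutdown wins the race" ordering is
 forced by hand so that the outcome is deterministic.

```python
import threading


class GuardedLoopSketch:
    """Hypothetical stand-in for a worker whose loop is guarded by a flag."""

    def __init__(self):
        self._running = True  # flag shared between threads

    def run_timer(self, start_event):
        # The start notification lives inside the loop, so it is only
        # sent if the flag is still True when the thread gets to run.
        while self._running:
            start_event.set()
            break  # the real periodic work is omitted in this sketch


def demo_race():
    sketch = GuardedLoopSketch()
    started = threading.Event()
    # Assumption: shutdown clears the flag before the worker thread is
    # scheduled.  We force that ordering here instead of relying on timing.
    sketch._running = False
    worker = threading.Thread(target=sketch.run_timer, args=(started,))
    worker.start()
    worker.join()
    # The loop body never ran, so anyone waiting on the event would time out.
    return started.wait(0.1)
```

 In this sketch demo_race() returns False, because the loop body is skipped
 and the event is never set.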

 One possible way to address this concern is to add comments about why we
 do this, but I personally would like to offer a "simpler" solution: run
 the while loop unconditionally and always return from this method once we
 receive a message from the "shutdown socket" (= self._read_sock).  I'm
 attaching a proposed diff to the ticket.
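
 As a rough illustration of that idea (not the attached diff itself), a
 loop of this kind could be sketched as follows.  The class RefreshSketch,
 the 0.05 second timeout, the iterations counter, and the choice to set
 start_event before entering the loop are all assumptions made for this
 example; only _read_sock and start_event are names taken from the
 discussion above.

```python
import select
import socket
import threading


class RefreshSketch:
    """Hypothetical stand-in for the refresh timer; not the real class."""

    def __init__(self):
        # The "shutdown socket" pair: the worker reads from _read_sock,
        # and shutdown() writes to _write_sock.
        self._read_sock, self._write_sock = socket.socketpair()
        self.iterations = 0

    def run_timer(self, start_event=None):
        # Assumption: notify the starter before the loop, so the
        # notification does not depend on any shared flag.
        if start_event is not None:
            start_event.set()
        while True:  # run unconditionally
            readable, _, _ = select.select([self._read_sock], [], [], 0.05)
            if self._read_sock in readable:
                # Any message on the shutdown socket ends the method.
                self._read_sock.recv(32)
                return
            self.iterations += 1  # placeholder for the periodic work

    def shutdown(self):
        self._write_sock.send(b"shutdown")


def demo():
    sketch = RefreshSketch()
    started = threading.Event()
    worker = threading.Thread(target=sketch.run_timer, args=(started,))
    worker.start()
    started_ok = started.wait(5)
    sketch.shutdown()
    worker.join(5)
    still_alive = worker.is_alive()
    sketch._read_sock.close()
    sketch._write_sock.close()
    return started_ok, still_alive
```

 The point of the design is that the only thing the worker thread consults
 to decide whether to stop is the socket, so there is no flag whose value
 can change between the check and the loop body.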

 According to the code comment, _running seems to be used to detect and
 reject a possible "false alarm" delivered to _read_sock, but IMO it's much
 better to avoid relying on a variable shared between threads than to
 worry about such a very rare error case.

 What do you think?

-- 
Ticket URL: <http://bind10.isc.org/ticket/400#comment:2>
BIND 10 Development <http://bind10.isc.org>