BIND 10 #335: b10-xfrout not exiting

Tue Oct 5 08:10:21 UTC 2010

#335: b10-xfrout not exiting
-------------------------+--------------------------------------------------
      Reporter:  jreed   |        Owner:  vorner   
          Type:  defect  |       Status:  reviewing
      Priority:  major   |    Milestone:           
     Component:  xfrout  |   Resolution:           
      Keywords:          |    Sensitive:  0        
Estimatedhours:  0.0     |        Hours:  0        
      Billable:  1       |   Totalhours:  0        
      Internal:  0       |  
-------------------------+--------------------------------------------------

Comment(by vorner):

 I have problems with failing make check here:
 {{{
 ...............F..........F........
 ======================================================================
 FAIL: test_run_timer (__main__.TestZonemgrRefresh)
 This case will run timer in daemon thread.
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "/home/vorner/work/bind10/src/bin/zonemgr/tests/zonemgr_test.py",
 line 399, in test_run_timer
     self.assertFalse(listener.is_alive())
 AssertionError: True is not False

 ======================================================================
 FAIL: test_shutdown (__main__.TestZonemgrRefresh)
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "/home/vorner/work/bind10/src/bin/zonemgr/tests/zonemgr_test.py",
 line 417, in test_shutdown
     self.assertFalse(listener.is_alive())
 AssertionError: True is not False
 }}}

 If I look at the test:
 {{{
 def test_shutdown(self):
         self.zone_refresh._check_sock = self.zone_refresh._master_socket
         listener = threading.Thread(target=self.zone_refresh.run_timer)
         listener.start()
         self.assertTrue(listener.is_alive())
         self.zone_refresh.shutdown()
         self.assertFalse(listener.is_alive())
 }}}

 It calls shutdown in one thread and checks that other thread already
 exited. But it does not have enough time to do so, the is_alive() is
 called too early (there's a race condition, running again produced only
 one failed test, not 2). You do wait for the thread event, but then,
 there's still the thread teardown (like exit from the function and such)
 and if a context switch happens on the event set (set is called, the
 thread which called shutdown is woken up and the one that was shutting
 down do not have time to exit the function).

 Furthermore, I do not like this (xfrin.py.in, somewhere after line 500):

 {{{
             try:
                 self._send_cc_session.group_sendmsg(msg,
 XFROUT_MODULE_NAME)
                 self._send_cc_session.group_sendmsg(msg,
 ZONE_MANAGER_MODULE_NAME)
             except:
                 pass
 }}}

 Not that we ignore all exceptions without doing something with them, this
 also catches all kinds of KeyboardInterrupt or SystemExit objects that are
 not proper exceptions. We should at last catch Exception, but, best of,
 catch only the one that killed msgq generates (and probably log a
 warning).

 As well as this test (zonemgr_test.py):
 {{{
         # test select.error by using bad file descriptor
         bad_file_descriptor = self.zone_refresh._master_socket.fileno()
         self.zone_refresh._check_sock = bad_file_descriptor
         self.zone_refresh._master_socket.close()
         self.assertRaises(None, self.zone_refresh.run_timer())
 }}}

 I think there shouldn't be None, but the exception that really is thrown,
 because the code can fail for some other reason and we shouldn't catch
 that.

 {{{
             if self._read_sock in rlist: # awaken by shutdown socket
                 break
 }}}

 As linux man page says in the Bugs section: „Under Linux, select() may
 report a socket file descriptor as "ready for reading", while nevertheless
 a  subsequent  read  blocks.“ If such spurious wakeup happens on the
 _read_sock, we should not exit. Note that continue instead of break would
 solve this, since self._runnable would be False if the real shutdown
 comes.

-- 
Ticket URL: <https://bind10.isc.org/ticket/335#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development