BIND 10 #335: b10-xfrout not exiting
BIND 10 Development
do-not-reply at isc.org
Tue Oct 5 08:10:21 UTC 2010
#335: b10-xfrout not exiting
-------------------------+--------------------------------------------------
Reporter: jreed | Owner: vorner
Type: defect | Status: reviewing
Priority: major | Milestone:
Component: xfrout | Resolution:
Keywords: | Sensitive: 0
Estimatedhours: 0.0 | Hours: 0
Billable: 1 | Totalhours: 0
Internal: 0 |
-------------------------+--------------------------------------------------
Comment(by vorner):
I have problems with failing make check here:
{{{
...............F..........F........
======================================================================
FAIL: test_run_timer (__main__.TestZonemgrRefresh)
This case will run timer in daemon thread.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/vorner/work/bind10/src/bin/zonemgr/tests/zonemgr_test.py",
line 399, in test_run_timer
self.assertFalse(listener.is_alive())
AssertionError: True is not False
======================================================================
FAIL: test_shutdown (__main__.TestZonemgrRefresh)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/vorner/work/bind10/src/bin/zonemgr/tests/zonemgr_test.py",
line 417, in test_shutdown
self.assertFalse(listener.is_alive())
AssertionError: True is not False
}}}
If I look at the test:
{{{
def test_shutdown(self):
self.zone_refresh._check_sock = self.zone_refresh._master_socket
listener = threading.Thread(target=self.zone_refresh.run_timer)
listener.start()
self.assertTrue(listener.is_alive())
self.zone_refresh.shutdown()
self.assertFalse(listener.is_alive())
}}}
It calls shutdown in one thread and checks that other thread already
exited. But it does not have enough time to do so, the is_alive() is
called too early (there's a race condition, running again produced only
one failed test, not 2). You do wait for the thread event, but then,
there's still the thread teardown (like exit from the function and such)
and if a context switch happens on the event set (set is called, the
thread which called shutdown is woken up and the one that was shutting
down do not have time to exit the function).
Furthermore, I do not like this (xfrin.py.in, somewhere after line 500):
{{{
try:
self._send_cc_session.group_sendmsg(msg,
XFROUT_MODULE_NAME)
self._send_cc_session.group_sendmsg(msg,
ZONE_MANAGER_MODULE_NAME)
except:
pass
}}}
Not that we ignore all exceptions without doing something with them, this
also catches all kinds of KeyboardInterrupt or SystemExit objects that are
not proper exceptions. We should at last catch Exception, but, best of,
catch only the one that killed msgq generates (and probably log a
warning).
As well as this test (zonemgr_test.py):
{{{
# test select.error by using bad file descriptor
bad_file_descriptor = self.zone_refresh._master_socket.fileno()
self.zone_refresh._check_sock = bad_file_descriptor
self.zone_refresh._master_socket.close()
self.assertRaises(None, self.zone_refresh.run_timer())
}}}
I think there shouldn't be None, but the exception that really is thrown,
because the code can fail for some other reason and we shouldn't catch
that.
{{{
if self._read_sock in rlist: # awaken by shutdown socket
break
}}}
As linux man page says in the Bugs section: „Under Linux, select() may
report a socket file descriptor as "ready for reading", while nevertheless
a subsequent read blocks.“ If such spurious wakeup happens on the
_read_sock, we should not exit. Note that continue instead of break would
solve this, since self._runnable would be False if the real shutdown
comes.
--
Ticket URL: <https://bind10.isc.org/ticket/335#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list