BIND 10 #299: AXFR fails half the time
BIND 10 Development
do-not-reply at isc.org
Fri Nov 5 05:03:45 UTC 2010
#299: AXFR fails half the time
-----------------------------+----------------------------------------------
Reporter: zzchen_pku | Owner: jinmei
Type: defect | Status: reviewing
Priority: major | Milestone:
Component: xfrout | Resolution:
Keywords: | Sensitive: 0
Estimatedhours: 0.0 | Hours: 0
Billable: 1 | Totalhours: 1.5
Internal: 0 |
-----------------------------+----------------------------------------------
Changes (by zzchen_pku):
* owner: zzchen_pku => jinmei
Comment:
Replying to [comment:34 jinmei]:
> okay, confirmed. I've noticed there are obsolete comments (due to the
change of this branch) in the test code. I've directly made a change on
the branch to remove them (r3435).
Thanks, Jinmei.
> - I don't understand why we need to check _shutdown_event in
!XfroutSession.handle():
> {{{
> while not self.server._shutdown_event.is_set(): # Check if
xfrout is shutdown
> }}}
> when shutdown happens, shouldn't the !XfroutSession be able to detect
that via the _shutdown_sock?
I remember vorner mentioned before:
{{{
As linux man page says in the Bugs section: „Under Linux, select() may
report a
socket file descriptor as "ready for reading", while nevertheless a
subsequent
read blocks.“ If such spurious wakeup happens on the _read_sock, we should
not
exit. Note that continue instead of break would solve this, since
self._runnable
would be False if the real shutdown comes.
The spurious wakeup does not happen usually, as I gather it happens only
under
extreme conditions (high load, bad checksum on incoming packet) and it is
considered a bug.
}}}
Just to ensure the real shutdown comes.
> - when EINTR happens, recv_fd() will be called even if it may not be
readable.
> - why can't this be log but stderr?
> {{{
> sys.stderr.write("[b10-xfrout] Error with select():
%s\n" %e)
> }}}
Done.
> - (this is not specific to this patch but anyway) I'm not sure if it's
guaranteed all data is received here:
> {{{
> msgdata = self.request.recv(msg_len)
> }}}
> isn't it possible (at least in theory) that it's a partial read?
The xfrout request message shouldn't be too large, but in order to avoid
partial read, have updated the recv logic.
> - naively assuming IPv4 is not good:
> {{{
> sock = socket.fromfd(fd, socket.AF_INET, socket.SOCK_STREAM)
> }}}
> although as far as I know the second parameter of fromfd will be
simply ignored.
> - I suspect the naming of "_{master,slave}_sock" is confusing:
> {{{
> self._master_sock, self._slave_sock = socket.socketpair()
> }}}
> because it may look like related to the notion of DNS master (primary)
and slave (secondary) while it is not.
>
> '''fd_shar_python'''
> - I was not sure what "REV" in "XFR_FD_REV_FAIL" means (receive?). I'd
suggest not to be afraid of using relatively long variable name if it
clarifies the intent.
> - using static this way is not C++-ish:
> {{{
> static PyObject *XFR_FD_REV_FAIL;
> }}}
> as far as remember this usage is actually even "deprecated".
(although this type of cleanup may better be deferred to a different
ticket)
> - also please add documentation of this constatnt (at least in C++, and
if possible in the python wrapper)
> - error cases should be handled properly here:
> {{{
> XFR_FD_REV_FAIL = Py_BuildValue("i", isc::xfr::XFR_FD_REV_FAIL);
> PyModule_AddObject(mod, "XFR_FD_REV_FAIL", XFR_FD_REV_FAIL);
> }}}
Updated.
> > I think we need to create a new ticket for this:-), or this ticket
will become more complex. I'll create one if you agree.
> For that matter, I've noticed test coverage of xfrout is not really good
(my test showed only 57% of the code is covered). We'll also need a
separate task (ticket) to improve test coverage.
Please refer to #406 and #407.
--
Ticket URL: <http://bind10.isc.org/ticket/299#comment:35>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list