BIND 10 #299: AXFR fails half the time
BIND 10 Development
do-not-reply at isc.org
Fri Oct 22 13:10:30 UTC 2010
#299: AXFR fails half the time
-----------------------------+----------------------------------------------
Reporter: zzchen_pku | Owner: zzchen_pku
Type: defect | Status: reviewing
Priority: major | Milestone:
Component: xfrout | Resolution:
Keywords: | Sensitive: 0
Estimatedhours: 0.0 | Hours: 0
Billable: 1 | Totalhours: 1.5
Internal: 0 |
-----------------------------+----------------------------------------------
Changes (by jinmei):
* owner: jinmei => zzchen_pku
Comment:
Replying to [comment:23 zzchen_pku]:
> Please find the new change in r3301. Appreciate your comments.
Hmm, more and more I'm familiar with the code, more and more questions
arise:-)
But I'll focus on the diff it self.
- test doesn't pass for me.
{{{
ERROR: test_check_xfrout_available (__main__.TestXfroutSession)
----------------------------------------------------------------------
Traceback (most recent call last):
File
"/Users/jinmei/src/isc/bind10/branches/trac299/src/bin/xfrout/tests/xfrout_test.py",
line 83, in setUp
self.xfrsess = MyXfroutSession(request, None, None, self.log)
TypeError: __init__() takes exactly 6 positional arguments (5 given)
(many more errors follow)
}}}
did you test it yourself?
- there seems to be a race condition about SESSION_RUNNABLE. see the
discussion for trac #352. But, even if we fixed the race, I wouldn't be
so much happy because the use of this variable scatters over the source
and it's very difficult to trace how exactly it works, much less be sure
whether it's correct. I see the need for more fundamental refactoring
here. (see also below).
- I would avoid using hardcoded value like "-2":
{{{
if fd == -2:
self._log.log_message("error", "Failed to receive the
file descriptor for XFR connection")
}}}
I'd define a constant in the fd_share library (both C++ and python
binding) and use that constant in the rest of the code.
- there are many redundant white spaces at EOLs or in blank lines, which
seem to be common in patches from CNNIC developers. You might want to
check your editor/IDE setting. (Or I suggest you write a checker script
and run it before you commit anything)
I've not reviewed all of the code yet because I have a more fundamental
issue (which is not directly due to this particular patch).
First of all, are we intentionally allowing multiple clients to connect to
the xfrout server via the unix domain socket? The current implementation
allows that because it inherits from !ThreadingUnixStreamServer. This
behavior might be good (e.g. if we use a multi-process model for b10-auth
in the future, each sub process may want to make a separate connection to
xfrout), but I'm not sure about the intent from the code. Actually, it
rather seems to assume a single client.
Second, since the actual xfrout session works in the blocking mode, we
cannot handle multiple xfrout sessions concurrently. This is not good if
we have multiple large zones and when xfrout sessions run for these zones
at the same time. And, of course, the shutdown procedure cannot (always)
be as quick as it should be.
Depending on the answer to the first point, we might rather avoid using
the socketserver framework. And, for the second part we need more
sophisticated operation for xfrout sessions.
Finally, despite the pretty complicated structure of code, xfrout is only
quite poorly documented. We need to provide fully detailed design
architecture and more detailed description for the classes and major
methods.
--
Ticket URL: <https://bind10.isc.org/ticket/299#comment:24>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list