BIND 10 #299: AXFR fails half the time

BIND 10 Development do-not-reply at isc.org
Fri Oct 29 04:07:41 UTC 2010


#299: AXFR fails half the time
-----------------------------+----------------------------------------------
      Reporter:  zzchen_pku  |        Owner:  jinmei   
          Type:  defect      |       Status:  reviewing
      Priority:  major       |    Milestone:           
     Component:  xfrout      |   Resolution:           
      Keywords:              |    Sensitive:  0        
Estimatedhours:  0.0         |        Hours:  0        
      Billable:  1           |   Totalhours:  1.5      
      Internal:  0           |  
-----------------------------+----------------------------------------------
Changes (by zzchen_pku):

  * owner:  zzchen_pku => jinmei


Comment:

 Replying to [comment:24 jinmei]:
 > - test doesn't pass for me.
 Sorry, I have updated the unittest.
 >  - there seems to be a race condition about SESSION_RUNNABLE.  see the
 discussion for trac #352.  But, even if we fixed the race, I wouldn't be
 so much happy because the use of this variable scatters over the source
 and it's very difficult to trace how exactly it works, much less be sure
 whether it's correct.  I see the need for more fundamental refactoring
 here.  (see also below).
 Use shutdown_evnet instead of SESSION_RUNNABLE to eliminate race condition
 >  - I would avoid using hardcoded value like "-2":
 > {{{
 >                 if fd == -2:
 >                     self._log.log_message("error", "Failed to receive
 the file descriptor for XFR connection")
 > }}}
 >   I'd define a constant in the fd_share library (both C++ and python
 binding) and use that constant in the rest of the code.
 Done.
 >  - there are many redundant white spaces at EOLs or in blank lines,
 which seem to be common in patches from CNNIC developers.  You might want
 to check your editor/IDE setting.  (Or I suggest you write a checker
 script and run it before you commit anything)
 I haven't concern my editor setting before:-)thank you for your remind, I
 have removed the redundant white spaces
 > I've not reviewed all of the code yet because I have a more fundamental
 issue (which is not directly due to this particular patch).
 >
 > First of all, are we intentionally allowing multiple clients to connect
 to the xfrout server via the unix domain socket?  The current
 implementation allows that because it inherits from
 !ThreadingUnixStreamServer.  This behavior might be good (e.g. if we use a
 multi-process model for b10-auth in the future, each sub process may want
 to make a separate connection to xfrout), but I'm not sure about the
 intent from the code.  Actually, it rather seems to assume a single
 client.
 >
 > Second, since the actual xfrout session works in the blocking mode, we
 cannot handle multiple xfrout sessions concurrently.  This is not good if
 we have multiple large zones and when xfrout sessions run for these zones
 at the same time.  And, of course, the shutdown procedure cannot (always)
 be as quick as it should be.
 >
 > Depending on the answer to the first point, we might rather avoid using
 the socketserver framework.  And, for the second part we need more
 sophisticated operation for xfrout sessions.
 >
 > Finally, despite the pretty complicated structure of code, xfrout is
 only quite poorly documented.  We need to provide fully detailed design
 architecture and more detailed description for the classes and major
 methods.
 I think we need to create a new ticket for this:-), or this ticket will
 become more complex. I'll create one if you agree.

 Please find the new change in r3395, thank you.

-- 
Ticket URL: <http://bind10.isc.org/ticket/299#comment:25>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list