BIND 10 #2202: introduce a lock for the data source client in auth

Tue Oct 2 01:00:47 UTC 2012

#2202: introduce a lock for the data source client in auth
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  jinmei                             |                Status:  reviewing
                       Type:  task   |             Milestone:
                   Priority:         |  Sprint-20121009
  medium                             |            Resolution:
                  Component:         |             Sensitive:  0
  b10-auth                           |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  5
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
  background zone loading            |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 First, I seem to have forgotten to push some suggested editorial fixes
 at the previous review.  I now pushed them after resolving conflicts.
 Please check them.

 Replying to [comment:23 vorner]:
 >
 > > The 1.67% drop in the average case is probably acceptable, but frankly
 > > I'd be concerned about the 5%-ish worse case overhead, especially
 > > because the lock should be unnecessary in the vast majority of cases.
 > > In future versions we should probably think about making the
 > > performance sensitive code lock-free, but for now, I think we should
 > > accept the overhead and complete the feature.
 >
 > I hope the non-debug build (there's a ticket for it somewhere) should
 improve
 > it significantly.

 I'm not so optimistic; I didn't see much overhead in the debug mode.
 (but replacing isc_throw with assert() may have a visible impact;
 apparently this macro can have substantial overhead for a supposedly
 simple operation).  But in any case, I'm okay with deferring
 performance concerns.

 > > - Mutex destructor: it's now exception free.  Same for ~Locker().
 >
 > Actually, the ~Locker can still throw. Should I keep the comment or make
 sure
 > it aborts too?

 If you don't mind and it's basically "should never happen' situation,
 I'd rather abort() to keep the usual expectation for destructor.

 > > - `Thread::wait()`: mostly a theoretical concern, but since this
 > >   method can throw, it can make the object a strange state.  For
 > >   example, consider a very rare case of new'ing `UncaughtException`
 > >   throws.  Then even if pthread_join() was called impl_ will be still
 > >   non NULL.  But I have no good idea to address this concern in a
 > >   concise and clean way.  Maybe it should be very minor that we can
 > >   ignore it.  In that case I'd add some note about it at least.
 >
 > I don't know if this covers completely everything, but I don't think
 there
 > could be other place this could go wrong now.

 It looks okay, but the intent doesn't look obvious, so please add some
 comments about why we catch exceptions.

 > > - swarm test: I was impressed your imagination:-) but
 > >   performStrangeOperation seems to be way too much complicated for the
 > >   purpose.  [...]
 >
 > When writing it, I spend quite some time trying to come up with
 something that:
 >  * The compiler shouldn't be able to optimize it to something atomic
 (like an
 >    increment on one integer which is atomic on many architectures).
 >  * Would keep some kind of invariant on the data when doing it
 correctly.
 >  * Would break with a high chance when accessing it in parallel.
 >  * Would not mess the memory completely, so we can safely detect
 something is
 >    wrong but not crash.

 Do compilers really use atomic operations (like xadd) for things like
 adding a value to integers?  That's against my previous experiences.
 But even so, I believe we can avoid the use of atomic operations more
 easily, e.g., by just using a bit more complicated types like double.
 I'm attaching some experimental diff.  It does not always "fail" but
 fails sufficiently often in my development environment (more than
 80%), and the code is much simpler.

 I'm afraid the current setup is too complicated and difficult to
 maintain.  If we can have a reasonably workable (even if not perfect)
 and simpler test case, I'd like to trade the reproducibility for that.

 > > - comment style is different than the expected one per guideline
 > > {{{#!cpp
 > >     /**
 > >      * \brief Return a mutex for the client lists.
 > >      *
 > > }}}
 > >   etc.
 >
 > But the /// ones break the \code snippets :-(. How is it done, then?

 Really?  How?  From a quick experiment the attached patch worked fine
 for me.  Maybe it's doxygen version issue?

 '''auth_srv_unittest.cc'''
 - About mutex test: the revised comment still doesn't make sense to me.
 {{{#!cpp
     // TODO: Once we have non-debug build, this one will not work, since
     // we currently use the fact that we can't lock twice from the same
     // thread. In the non-debug mode, this would deadlock.
     // Skip then.
 }}}
   Doesn't this case result in a failure from pthread_mutex_lock() due
   to the attempt of recursive lock (as we specify ERRORCHECK)?  Or is
   this another case where Linux behaves strangely?

-- 
Ticket URL: <http://bind10.isc.org/ticket/2202#comment:24>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development