BIND 10 #2861: synchronization between auth main thread and datasrc builder

Wed Jul 10 15:52:37 UTC 2013

#2861: synchronization between auth main thread and datasrc builder
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  vorner
            Priority:  medium        |                       Status:
           Component:  b10-auth      |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130723
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  4             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  shared memory data source
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by pselkirk):

 * owner:  pselkirk => vorner

Comment:

 Replying to [comment:14 vorner]:
 > I added the documentation (I thought the original was mapping between
 the „first“ and „second“ names and the actual roles, and that the current
 names are self-explanatory).
 >
 > But, I don't see any optional arguments here.

 I use "optional" in a loose sense here. 'params' may be null, 'callback'
 may be null. I like to see that level of detail, but documentation varies
 widely across the code base.

 > > It seems that it would be better to throw an exception, which would be
 immediately caught and logged by the catch block. The program would still
 terminate, but with something like
 > > {{{
 > > terminate called after throwing an instance of 'isc::Unexpected'
 > >   what():  failed to wake up main thread
 > > }}}
 >
 > Would it? Generally, the `stderr` of the program is not shown to anybody
 anyway and throwing and catching it in the same function seems like badly
 masked `goto`, confusing the flow.

 That's the message I got on a quick hack of the code.

 If you think it would be better to terminate directly, I'll defer to you.
 The thing I don't like to see is a program dumping core, because that (to
 me) is a sign of poor error detection and recovery.

 > I believe the more important thing is logging, than `stderr` output. My
 version does that, and with more concrete log message than the proposed
 one. It's just that the logging is turned off during tests by default.
 >
 > However, I'd be interested in what exactly you did to trigger it. It
 might be useful to build a test case on top of that.

 To trigger the error condition, I just closed the socket. It's completely
 contrived, and I don't know if there's any way to trap the `terminate`, so
 I don't know if there's any practical way to build a test case around it.

 {{{
 diff --git a/src/bin/auth/tests/datasrc_clients_builder_unittest.cc
 b/src/bin/auth/tests/datasrc_clients_builder_unittest.cc
 index 53a1f2c..dec3bd1 100644
 --- a/src/bin/auth/tests/datasrc_clients_builder_unittest.cc
 +++ b/src/bin/auth/tests/datasrc_clients_builder_unittest.cc
 @@ -117,6 +117,7 @@ void emptyCallsback() {}
  TEST_F(DataSrcClientsBuilderTest, commandFinished) {
      command_queue.push_back(Command(SHUTDOWN, ConstElementPtr(),
                                      emptyCallsback));
 +    close(write_end);
      builder.run();
      EXPECT_EQ(0, cond.wait_count); // no wait because command queue is
 not empty
      // Once for picking up data, once for putting the callback there
 }}}

 >
 > > Finally, I understand the purpose of `FDGuard`, but I don't understand
 why you don't just put the `close()` calls in the `DataSrcClientsMgrBase`
 destructor, or in `cleanup()`. Why do you need an extra object for this?
 >
 > I might be wrong here. But AFAIK if an exception happens when calling
 one of the initializers in the constructor, the only the destructors for
 the already initialized members are called (in reverse order), but not the
 main destructor of the class. So, if `builder_thread_`, for example,
 threw, the `close()` wouldn't get called at all and both the file
 descriptors would leak. As the guard is initialized before the file
 descriptors, its destructor gets called even when the whole constructor
 did not run yet.
 >
 > I did not include this explanation in a comment in the code, because I
 believe there are many cases in our code like this. But if you think this
 case is more complex than usual, I can try writing it down there too.

 Okay, I didn't understand the complexity of cleaning up from a constructor
 failure. If this is a common pattern throughout the code base, I'm okay
 with not documenting it here. I just hadn't seen it before.

 Okay to merge.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2861#comment:15>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development