BIND 10 #2400: notify auth::DataSrcClientsMgr when the builder thread dies

Wed Oct 24 07:19:05 UTC 2012

#2400: notify auth::DataSrcClientsMgr when the builder thread dies
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:
  jinmei                             |                Status:  new
                       Type:         |             Milestone:  Next-Sprint-
  defect                             |  Proposed
                   Priority:         |            Resolution:
  medium                             |             Sensitive:  0
                  Component:         |           Sub-Project:  DNS
  b10-auth                           |  Estimated Difficulty:  0
                   Keywords:         |           Total Hours:  0
            Defect Severity:  N/A    |
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by vorner):

 Hello

 Replying to [ticket:2400 jinmei]:
 > So the current implementation takes a harsh way: the builder thread
 > terminates the entire process by `assert(false)`.  This is still
 > suboptimal.

 Hmm, I didn't notice this during yesterday's review, but is `assert(false)`
 the best way? Asserts can be turned off (they are compiled out when `NDEBUG`
 is defined), so this probably should have been `abort()`.
 Should we fast-fix this somehow?
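
 To make the concern concrete, here is a minimal sketch, not the actual
 BIND 10 code. `survivesAssert()` and `fatalBuilderError()` are hypothetical
 names invented for illustration; the sketch assumes a C++11 compiler. It
 shows that `assert(false)` does nothing once `NDEBUG` is defined, while a
 helper built on `std::abort()` terminates the process regardless.

```cpp
// Sketch only: shows that assert() vanishes when NDEBUG is defined.
#ifndef NDEBUG
#define NDEBUG            // what a typical release build defines
#endif
#include <cassert>
#include <cstdlib>
#include <iostream>

// Returns true if execution continues past assert(false).
bool survivesAssert() {
    assert(false);        // compiled out under NDEBUG, so nothing happens
    return true;
}

// Hypothetical helper (not BIND 10 code): terminates regardless of NDEBUG.
[[noreturn]] void fatalBuilderError(const char* what) {
    std::cerr << "builder thread failed: " << what << std::endl;
    std::abort();
}

// Restore normal assert() behaviour for any code that follows.
#undef NDEBUG
#include <cassert>
```

 With this, `survivesAssert()` returns normally, which is exactly the
 silent failure mode being worried about; calling `fatalBuilderError()` in
 the builder's catch-all handler would not have that problem.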

 > - on unexpected termination, the builder sets it to false.  (this
 >   could be protected by a mutex, but that's not absolutely necessary)

 Is there a reason why it would be safe not to protect it with a mutex? If
 the reason is that a boolean is probably small, well, that doesn't sound
 like a very good reason to me. I've heard that even `sig_atomic_t` is not
 safe to use this way.
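
 For illustration, one way to share such a flag between threads without a
 mutex is an atomic variable. This is only a sketch under assumptions: it
 uses C++11 `std::atomic` and `std::thread` (which the code base may not be
 able to rely on), and `BuilderStatus`, `runBuilder()` and
 `builderAliveAfterRun()` are hypothetical names, not existing classes or
 functions.

```cpp
#include <atomic>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for the shared "builder is alive" flag.
struct BuilderStatus {
    std::atomic<bool> alive{true};
};

// Hypothetical builder body: any unexpected exception clears the flag.
void runBuilder(BuilderStatus& status, bool simulate_failure) {
    try {
        if (simulate_failure) {
            throw std::runtime_error("simulated builder failure");
        }
    } catch (...) {
        status.alive.store(false);   // atomic write, visible to other threads
    }
}

// Runs the builder in its own thread and reports the flag afterwards.
bool builderAliveAfterRun(bool simulate_failure) {
    BuilderStatus status;
    std::thread builder(runBuilder, std::ref(status), simulate_failure);
    builder.join();
    return status.alive.load();      // atomic read
}
```

 A plain `bool` written by one thread and read by another without any
 synchronization is a data race in the C++11 memory model, however small
 the variable is, so either an atomic or a mutex is needed here.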

 > - the manager checks variable every time it sends a new command to the
 >   builder or allow the application to access the client lists via the
 >   holder.  If the value has been changed to false, it takes an
 >   appropriate action (in practice, it would throw a fatal exception to
 >   terminate the process anyway).

 I'm not sure about this. Say an update to a zone happens on Monday and the
 thread crashes while handling it. The server happily keeps running with the
 old version of the zone, because it never swapped in the new one. Then the
 next update happens on, for example, Friday, and the server unexpectedly
 crashes during an unrelated update, so the admin starts looking into what
 is wrong with the Friday update.

 What I think could work is to send the main thread a message over the
 msgq, for example. Or signal it (though that's tricky). Or somehow notify
 it right away.
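
 As one possible shape for "notify it right away", here is a minimal sketch
 using a POSIX pipe. This is an assumption for illustration, not what the
 ticket proposes or what the code does; `builderMain()` and
 `builderFailed()` are hypothetical names, and C++11 `std::thread` is
 assumed. The builder writes one byte as soon as it hits an unexpected
 exception, so the failure is reported at the time it happens (Monday in
 the scenario above) rather than at the next command.

```cpp
#include <unistd.h>    // pipe, read, write, close (POSIX)
#include <stdexcept>
#include <thread>

// Hypothetical builder body: on an unexpected exception it immediately
// writes one byte to notify_fd so the main thread learns of it right away.
void builderMain(int notify_fd, bool simulate_failure) {
    try {
        if (simulate_failure) {
            throw std::runtime_error("simulated builder failure");
        }
    } catch (...) {
        const char dead = 'X';
        if (write(notify_fd, &dead, 1) != 1) {
            // Nothing more can be done in this sketch; real code should
            // fall back to terminating the process.
        }
    }
    close(notify_fd);   // EOF on the read end means a clean exit
}

// Returns true if the builder reported a failure, false on clean exit.
bool builderFailed(bool simulate_failure) {
    int fds[2];
    if (pipe(fds) != 0) {
        throw std::runtime_error("pipe() failed");
    }
    std::thread builder(builderMain, fds[1], simulate_failure);
    char buf = 0;
    // Blocks until the builder writes a byte or closes its end.
    const ssize_t n = read(fds[0], &buf, 1);
    builder.join();
    close(fds[0]);
    return n == 1 && buf == 'X';
}
```

 In a real server the main thread would not block in `read()`; the read
 end of the pipe would presumably be watched by the main event loop, which
 could then log the failure and shut down at once.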

-- 
Ticket URL: <http://bind10.isc.org/ticket/2400#comment:1>
BIND 10 Development <http://bind10.isc.org>