BIND 10 #2198: make sure InterprocessSyncLocker is thread safe
BIND 10 Development
do-not-reply at isc.org
Tue Sep 18 10:30:19 UTC 2012
#2198: make sure InterprocessSyncLocker is thread safe
-------------------------------------+-------------------------------------
Reporter: | Owner: muks
jinmei | Status: reviewing
Type: | Milestone:
defect | Sprint-20120918
Priority: | Resolution:
medium | Sensitive: 0
Component: | Sub-Project: Core
logging | Estimated Difficulty: 5
Keywords: | Total Hours: 0
Defect Severity: N/A |
Feature Depending on Ticket: |
Add Hours to Ticket: 0 |
Internal?: 0 |
-------------------------------------+-------------------------------------
Changes (by vorner):
* owner: vorner => muks
Comment:
Replying to [comment:7 muks]:
> > First, the branch #2202 introduces a wrapped lock (and a thread, but
it shouldn't be needed here). I think we should not be dealing directly
with pthreads through the code, both because of exception safety and error
handling and because of portability. Is there any reason why you didn't
use the lock from there?
>
> I didn't know that there's a new lock class introduced in the #2202
branch. :) That is the first answer for why I didn't use that lock. For
the RAII style lock, #2202's locker seems relevant. Ignoring that the
branch is still in review, we still can't use this implementation here as
the interface provided by that lock class is insufficient. It's not
possible to unlock the mutex before its destruction (i.e., before that
basic block is exited). I have not read more into #2202.
Looking at it again, I think the SyncMapMutex is more or less the same as
the
Locker from #2202. I don't think we want to have two implementations of
the
same thing, we have more than enough code already. Could it be reused at
least
for this mutex?
For the rest, well, the interface of #2202 mutex is minimalistic, since I
didn't need more. But it could indeed be extended. From a longer
perspective, I
think it is better to have one complete interface than two half-complete.
> > Furthermore, the code seems to be doing no error checking whatsoever.
Most of the pthread functions can return error codes and they should be
handled. Also, the new or insertion to the map can throw. If that happens,
the mutex stays locked, causing a deadlock the next time anyone tries to
acquire it.
>
> I agree with the use of a RAII-style lock. A simple wrapper around a
pthread mutex has been introduced. All pthread calls are now checked. Even
then, I don't suppose our code can catch such exceptions and continue in
case the map throws in this codepath.
Well, it may or may not crash on these. It should be at least possible to
do
some kind of graceful shutdown or recovery by dropping the current query
and
trying again. Our policy says the classes should not get to invalid state
on
exception.
Which brings me to these two parts of code:
{{{#!c++
data = new SyncMapData;
}}}
{{{#!c++
sync_map.erase(it);
int ret = pthread_mutex_destroy(&data->second);
if (ret != 0) {
isc_throw(isc::InvalidOperation,
"Error destroying InterprocessSyncFile mutex: "
<< strerror(ret));
}
delete data;
}}}
In both cases, it can leak if the pthread function returns error. The
second
one could be solved by deleting data even before the erase. The first one
could
be done with something like auto_ptr.
However, these are all patches for fragile design I think. I'd like to
propose
an alternative approach:
* Use the mutex from #2202. That one tries to be very paranoid about
errors
and be exception safe. Also, it reuses current code and makes future
porting
to windows easier.
* Have one global mutex for the map. This can use the locker on stack in
the
usual way. You unlock it sooner than you exit the method sometimes, but
I
don't think it makes a big difference on how long it is held and if it
did,
there can be an inner block inside the method to invoke the destructor
sooner.
* Each task has a mutex (wrapped one, from #2202) allocated dynamically.
The
`InterprocessSyncFile` holds a shared pointer to it (so more than one
can have
the same one) and the map (which is protected by the above mutex)
contains a
weak_ptr to the mutex. This way unused mutexes get cleaned up
automatically.
* An `InterprocessSyncFile` contains a scoped_ptr which holds a Locker
when the
corresponding mutex is locked.
This has the advantage that whenever anything is lost due to exiting a
method
or exception or something like that, everything gets cleared and unlocked
automatically. Also, it could become slightly more readable, there'd be
less
error checking.
--
Ticket URL: <http://bind10.isc.org/ticket/2198#comment:8>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list