Berkeley DB infinite sleep on Solaris [#3245]

Mon Jan 8 22:35:41 UTC 2001


> The Solaris man page (_lwp_cond_signal) is explicit:
>
>     "Both functions should be called under the protection of  the
>      same  LWP  mutex  lock  that  is used with the LWP condition
>      variable being  signaled. Otherwise, the condition  variable
>      may  be  signalled between the test of the associated condi-
>      tion and blocking in _lwp_cond_wait().  This  can  cause  an
>      infinite wait."

	This is flat-out incorrect. What it's suggesting is that the following can
happen:

Lock();
while (!Test_Predicate())
{
    /* <-- signal can occur here and be lost */
    UnlockAndBlockThenRelock();
}
Act();
Unlock();

	True, that wakeup would be lost. But the thread holds the lock that
protects the predicate, so the predicate cannot have changed and
'Test_Predicate' would return the same value. Had the thread received the
wakeup instead of losing it, it would simply have tested again and gone
back to sleep.

	Since the running thread holds the lock, no other thread can change the
predicate between the 'Test' and the 'Block'. Thus the predicate cannot
have changed state between the 'Test' and the 'Block'. So even if there
was a lost wakeup, the thread would still want to block because the
predicate is still false.

	The decision to block or not is based upon the predicate. So long as the
predicate does not change (and it can't because the thread holds a lock on
it), if it decided to sleep before the lost signal, it would just decide to
sleep again after it. So the lost signal simply saves a loop.
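
	To make this concrete, here is a minimal sketch. It uses POSIX threads
rather than the Solaris LWP calls quoted above; that substitution is an
assumption made for illustration, and the names waiter, signaler and
run_demo are hypothetical. The signaler changes the predicate while holding
the lock but sends the signal only after releasing it, which is the case
the man page warns about.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int predicate = 0;   /* only read or written with 'lock' held */
static int acted = 0;       /* set by the waiter once it gets past the loop */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!predicate) {
        /* A signal sent at this point wakes nobody, but the predicate
         * cannot change here either, because we still hold the lock. */
        pthread_cond_wait(&cond, &lock);   /* unlock, block, relock */
    }
    acted = 1;                             /* Act() */
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *signaler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    predicate = 1;                         /* change made under the lock */
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&cond);            /* signal sent WITHOUT the lock */
    return NULL;
}

/* Hypothetical driver: runs one waiter and one signaler to completion
 * and reports whether the waiter got to act. */
int run_demo(void)
{
    pthread_t w, s;
    predicate = 0;
    acted = 0;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&s, NULL, signaler, NULL);
    pthread_join(s, NULL);
    pthread_join(w, NULL);
    return acted;
}
```

	Whichever thread runs first, run_demo() should return 1. If the signaler
gets the lock first, the waiter sees the predicate already true and never
blocks. If the waiter gets it first, the signaler cannot touch the predicate
until the waiter is blocked in pthread_cond_wait, so the signal finds it
there.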

	DS