Berkeley DB infinite sleep on Solaris

Kiernan, Alex alexk at demon.net
Mon Jan 8 17:03:02 UTC 2001


Running inn (from CVS) on Solaris, we were seeing regular infinite sleeps
during expire. The Berkeley DB code was sleeping in _lwp_cond_wait because of
an apparent lost wakeup from _lwp_cond_signal. This patch appears to fix it,
or at least we've run for a week without it dying (plus a couple of other
Solaris sillies). In fact, I can't think why this wouldn't affect every
platform.

--- orig/db-3.2.3f/mutex/mut_pthread.c	Thu Oct  5 18:16:17 2000
+++ db-3.2.3f/mutex/mut_pthread.c	Wed Jan  3 16:03:04 2001
@@ -38,7 +38,7 @@
 #define	pthread_mutex_trylock	_lwp_mutex_trylock
 #define	pthread_mutex_unlock	_lwp_mutex_unlock
 #define	pthread_self		_lwp_self
-#define	pthread_mutex_destroy
+#define	pthread_mutex_destroy(x)	(0)
 #endif
 #ifdef HAVE_MUTEX_UI_THREADS
 #define	pthread_cond_signal	cond_signal
@@ -207,8 +207,13 @@
 			 * SIGALRM for some purpose, e.g., the C library sleep
 			 * call, and Solaris delivers the signal to the wrong
 			 * LWP.
+			 *
+			 * _lwp_cond_wait on Solaris can return EINTR (unlike
+			 * pthread_cond_wait which is guaranteed not to), so
+			 * check for that & loop
 			 */
-			if (ret != 0 && ret != ETIME && ret != ETIMEDOUT)
+			if (ret != 0 &&
+			    ret != ETIME && ret != ETIMEDOUT && ret != EINTR)
 				return (ret);
 		}
 
@@ -270,10 +275,10 @@
 
 		mutexp->locked = 0;
 
-		if ((ret = pthread_mutex_unlock(&mutexp->mutex)) != 0)
+		if ((ret = pthread_cond_signal(&mutexp->cond)) != 0)
 			return (ret);
 
-		if ((ret = pthread_cond_signal(&mutexp->cond)) != 0)
+		if ((ret = pthread_mutex_unlock(&mutexp->mutex)) != 0)
 			return (ret);
 	} else {
 		mutexp->locked = 0;

-- 
Alex Kiernan, Principal Engineer, Development, Thus PLC 


