innwatch forks without reason

Julien ÉLIE julien at trigofacile.com
Sat Sep 13 17:04:44 UTC 2014


Hi again Lauri,

>>> innwatch apparently creates a child process only for sleeping and then
>>> waits on that process. The forked-off process is not killed by 'rc.news
>>> stop' (only its parent is), and will only die after it's done sleeping.
>>> If running under SMF on illumos/Solaris, this causes the service to
>>> likely drop into maintenance state (since not all processes die within
>>> timeout).
>>
>> Thanks for your bug report and also your patch.
> 
> Is the subprocess "sleep 600" properly killed with your patch?
> I see that it is still running on my system (but maybe it is not seen by 
> illumos/Solaris -- does it only look at innwatch?)

Here is a new patch proposal to deal with the problem:
* add a new "-i" flag to innwatch to specify how many seconds it should sleep when started;
* add a new "-i" flag to cnfsstat to specify how many seconds it should sleep when started;
* when innwatch is killed, also kill the forked-off process it uses when sleeping.

This fixes the issue you reported, and also a similar issue that occurs
when INN is stopped within 60 seconds of being started.
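
To illustrate the third point, here is a minimal standalone sketch of the
idea (it is not the innwatch code itself; the names cleanup and
interruptible_sleep are only made up for this example).  The sleep runs in
the background and the script waits on it, so a signal interrupts the wait
at once and the trap can kill the sleep process instead of leaving it
behind:

```shell
#!/bin/sh
# Sketch only: cleanup and interruptible_sleep are hypothetical names,
# chosen for this example; they do not exist in innwatch.

CHILDPID=

# On HUP, QUIT or TERM, kill the background sleep (if any) and exit.
cleanup() {
    test -n "${CHILDPID}" && kill "${CHILDPID}" 2>/dev/null
    exit 1
}
trap cleanup 1 3 15

# Sleep in the background and wait for it.  Unlike a foreground sleep,
# the wait builtin returns as soon as a trapped signal arrives, so the
# trap runs while CHILDPID still names the sleep process.
interruptible_sleep() {
    sleep "$1" &
    CHILDPID=$!
    wait "${CHILDPID}"
    CHILDPID=
}

interruptible_sleep 1
echo "done sleeping"
```

If the script receives a TERM signal during the sleep, both the script and
its sleep child should be gone right away, which is what a service
supervisor waiting for all processes to exit needs.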

Here are the patches for the 3 files.
Could you please test these changes on your illumos/Solaris system and
tell me whether they also fix your issue?
Thanks!




--- rc.news	2014-09-07 16:24:59.000000000 +0200
+++ /home/news/bin/rc.news	2014-09-13 18:46:35.000000000 +0200
@@ -32,7 +32,7 @@
     # Stop innwatch (if running)
     if [ -f $WATCHPID ]; then
 	kill `cat $WATCHPID`
-	rm -f $WATCHPID
+	rm -f $WATCHPID ${LOCKS}/LOCK.innwatch
     fi
 
     printf "Stopping innd: "
@@ -163,12 +163,12 @@
 # Gee, looks like lisp, doesn't it?
 ${DOINNWATCH} && {
     echo "Scheduled start of ${INNWATCH}."
-    ( sleep 60 ; ${INNWATCH} ) > /dev/null &
+    ${INNWATCH} -i 60 > /dev/null &
 }
 
 ${DOCNFSSTAT} && {
     echo "Scheduled start of cnfsstat."
-    ( sleep 60 ; ${PATHBIN}/cnfsstat -s -l -P ) > /dev/null &
+    ${PATHBIN}/cnfsstat -i 60 -s -l -P > /dev/null &
 }
 
 RMFILE=${MOST_LOGS}/expire.rm






--- innwatch	2014-09-07 16:24:59.000000000 +0200
+++ /home/news/bin/innwatch	2014-09-13 18:23:29.000000000 +0200
@@ -17,6 +17,9 @@
 PROGNAME=innwatch
 LOCK=${LOCKS}/LOCK.${PROGNAME}
 DAILY=${LOCKS}/LOCK.news.daily
+INNWATCHINITIALSLEEPTIME=1
+CHILDPID=
+
 ##  Where to put the timestamp file (directory and filename).
 TIMESTAMP=${LOCKS}/${PROGNAME}.time
 
@@ -41,14 +44,21 @@
 	shift
 	;;
     X-f*)
-	FILE=`expr "$1" : '-s\(.*\)'`
+	FILE=`expr "$1" : '-f\(.*\)'`
 	;;
+    X-i)
+        INNWATCHINITIALSLEEPTIME=$2
+        shift
+        ;;
+    X-i*)
+        INNWATCHINITIALSLEEPTIME=`expr "$1" : '-i\(.*\)'`
+        ;;
     X-l)
 	LOGFILE=$2
 	shift
 	;;
     X-l*)
-	LOGFILE=`expr "$1" : '-s\(.*\)'`
+	LOGFILE=`expr "$1" : '-l\(.*\)'`
 	;;
     X-t)
 	INNWATCHSLEEPTIME=$2
@@ -86,7 +96,7 @@
     exit 0
 }
 
-trap 'rm -f ${LOCK} ${WATCHPID} ; exit 1' 1 3 15
+trap 'rm -f ${LOCK} ${WATCHPID} ; test -n "${CHILDPID}" && kill ${CHILDPID} ; exit 1' 1 3 15
 echo "$$" > ${WATCHPID}
 
 ##  The reason why we turned innd off, and its, and our current state.
@@ -102,7 +112,9 @@
 ##  We need to remember the process ID of innd, in case one exits.
 ##  But we need to wait for innd to start before we can do that.
 while PID=`cat ${SERVERPID} 2>/dev/null`; test -z "${PID}"; do
-    sleep ${INNWATCHSLEEPTIME}
+    sleep ${INNWATCHINITIALSLEEPTIME} & CHILDPID=$!
+    wait
+    CHILDPID=
 done
 
 trap '(
@@ -126,8 +138,10 @@
 NEXTSLEEP=1
 HASEXITED=false
 
-while sleep ${NEXTSLEEP}
+while { sleep ${NEXTSLEEP} & CHILDPID=$! ; }
 do
+    wait
+    CHILDPID=
     NEXTSLEEP=${INNWATCHSLEEPTIME}
 
     ##  If news.daily is running, idle:  we don't want to change the






--- cnfsstat	2014-09-07 16:24:11.000000000 +0200
+++ /home/news/bin/cnfsstat	2014-09-13 18:45:18.000000000 +0200
@@ -29,12 +29,13 @@
 Summary tool for CNFS
 
 Usage:
-	$0 [-ahpPsv] [-c class] [-l [seconds]] [-m buffer]
+	$0 [-ahpPsv] [-c class] [-i seconds] [-l [seconds]] [-m buffer]
 
 	If called without args, does a one-time status of all CNFS buffers
 	-a:           print the age of the oldest article in the cycbuff
 	-c class:     print out status of CNFS buffers in that class
 	-h:           this information
+	-i seconds:   initial sleep of that many seconds at startup
 	-l [seconds]: loop like vmstat, default seconds = 600
 	-m buffer:    print out information suitable for MRTG
 	-p:           print out an MRTG config file
@@ -54,7 +55,7 @@
 my %opt = (c=>\$oclass, m=>\$obuffer);
 Getopt::Long::config('no_ignore_case');
 GetOptions(\%opt,
-           "-a", "-c=s", "-h", "-l:i", "-m=s",
+           "-a", "-c=s", "-h", "-i=i", "-l:i", "-m=s",
            "-p", "-P", "-s", "-v");
 
 usage() if $opt{'h'};
@@ -96,6 +97,13 @@
 mrtg($obuffer) if $obuffer;
 mrtg_config() if $opt{'p'};
 
+if(defined($opt{'i'}) && $opt{'i'} > 0) {
+    sleep($opt{'i'});
+    if (!$use_syslog) {
+        print STDOUT "$opt{'i'} seconds later:\n";
+    }
+}
+
 
 START:
 




That's all (man pages and changelog will also be updated).

-- 
Julien ÉLIE

« Quousque tandem ? » (Cicéron)

