INN commit: trunk (4 files)
INN Commit
rra at isc.org
Mon Jul 18 22:41:23 UTC 2011
Date: Monday, July 18, 2011 @ 15:41:23
Author: iulius
Revision: 9284
improve scripts to send Path: statistics
Add two flags to sendinpaths: -k and -r control the interval of days
for processing dump files.
This allows daily statistics to be generated properly.
Also fix an issue where statistics could be missing for a couple of
days when sent monthly.
Improve the documentation.
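As a rough illustration of how the new flags might be used (a sketch based only on the usage string and documentation in this commit; <pathbin> is the documentation's placeholder for the INN binaries directory, and the schedule times and the number of days are arbitrary):

```text
# crontab of the news user: daily run with the defaults (-r 32 -k 0).
# Every dump file not yet processed is reported, then deleted.
10 6 * * * <pathbin>/sendinpaths

# Alternative: monthly run; the 32-day default covers a full month.
10 6 1 * * <pathbin>/sendinpaths

# Manual check from a shell, not a crontab entry: print a report for the
# last 7 days without mailing it and without deleting anything.
#   <pathbin>/sendinpaths -n -r 7
```

Note that dump files kept with a non-zero -k value remain eligible for later reports as long as they fall within the -r interval.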
Modified:
trunk/backends/sendinpaths.in
trunk/doc/pod/news.pod
trunk/doc/pod/ninpaths.pod
trunk/doc/pod/sendinpaths.pod
-------------------------+
backends/sendinpaths.in | 93 ++++++++++++++++++++++++++++++++++++----------
doc/pod/news.pod | 9 ++++
doc/pod/ninpaths.pod | 50 +++++++++++++++---------
doc/pod/sendinpaths.pod | 28 ++++++++++---
4 files changed, 136 insertions(+), 44 deletions(-)
Modified: backends/sendinpaths.in
===================================================================
--- backends/sendinpaths.in 2011-07-17 19:42:02 UTC (rev 9283)
+++ backends/sendinpaths.in 2011-07-18 22:41:23 UTC (rev 9284)
@@ -1,44 +1,97 @@
#!/bin/sh
# fixscript will replace this line with code to load innshellvars
#
-# Submit path statistics based on ninpaths
+# Submit path statistics based on ninpaths.
# $Id$
-# Assuming the ninpaths dump files are in ${MOST_LOGS}/path/inpaths.%d
+# Assuming the ninpaths dump files are in ${MOST_LOGS}/path/inpaths.%d files.
cd ${MOST_LOGS}/path
ME=`${NEWSBIN}/innconfval pathhost`
-report=30
-keep=14
-TMP=""
-defaddr="pathsurvey at top1000.org top1000 at anthologeek.net"
+USAGE="Usage: sendinpaths [-n] [-k keep-days] [-r report-days] [address [address ...]]"
+NOMAIL=false
+MAILTO=""
+DEFAULTMAILTO="pathsurvey at top1000.org top1000 at anthologeek.net"
+
+# Default to report up to 32 days (ideal for monthly statistics). It works fine
+# for daily stats too because already processed dump files are deleted by default
+# (0 day of kept articles).
+REPORT=32
+KEEP=0
+NINPATHS_ARGS=""
+
+# Parse command-line arguments.
+while [ $# -gt 0 ]
+do
+ case "$1" in
+ -k)
+ case "$2" in
+ *[^0-9]*)
+ echo "Argument to -k flag must be an integer."
+ exit 1
+ ;;
+ esac
+ KEEP=$2
+ shift
+ ;;
+ -n)
+ NOMAIL=true
+ ;;
+ -r)
+ case "$2" in
+ *[^0-9]*)
+ echo "Argument to -r flag must be an integer."
+ exit 1
+ ;;
+ esac
+ REPORT=$2
+ shift
+ ;;
+ -*)
+ echo $USAGE
+ exit 1
+ ;;
+ *)
+ MAILTO="${MAILTO} $1"
+ ;;
+ esac
+ shift
+done
+
# Renice to give other processes priority, since this isn't too important.
renice 20 -p $$ > /dev/null 2>&1
-# Make report from (up to) $report days of dumps
-LOGS=`find . -name 'inpaths.*' ! -size 0 -mtime -$report -print`
+# Make report from (up to) $REPORT days of dumps.
+LOGS=`find . -name 'inpaths.*' ! -size 0 \( -mtime -${REPORT} -o -mtime ${REPORT} \) -print`
if [ -z "$LOGS" ] ; then
- echo "No data has been collected this month!"
+ echo "No data has been collected since the last run of this script!"
exit 1
fi
-# for check dumps
+# Process dumps.
for i in $LOGS
do
- ninpaths -u $i -r $ME > /dev/null 2>&1
- if test $? -eq 0; then :
- TMP="$TMP -u $i"
- fi
+ ninpaths -u ${i} -r ${ME} > /dev/null 2>&1
+ if test $? -eq 0 ; then
+ NINPATHS_ARGS="${NINPATHS_ARGS} -u ${i}"
+ else
+ echo "Skipping unrecognized inpaths file ${i}"
+ fi
done
-if [ "$1" = "-n" ] ; then
- ninpaths $TMP -r $ME
+if [ -z "${NINPATHS_ARGS}" ] ; then
+ echo "No valid data has been collected since the last run of this script!"
+ exit 1
+fi
+
+if [ "${NOMAIL}" = "true" ] ; then
+ ninpaths ${NINPATHS_ARGS} -r ${ME}
else
- ninpaths $TMP -r $ME |\
- $MAILCMD -s "inpaths $ME" ${1:-$defaddr}
- # remove dumps older than $keep days
- find . -name 'inpaths.*' -mtime +$keep -exec rm '{}' \;
+ ninpaths ${NINPATHS_ARGS} -r ${ME} |\
+ ${MAILCMD} -s "inpaths ${ME}" ${MAILTO:-$DEFAULTMAILTO}
+ # Remove dumps older than $KEEP days.
+ find . -name 'inpaths.*' \( -mtime +${KEEP} -o -mtime ${KEEP} \) -exec rm '{}' \;
fi
exit 0
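The change to the find expressions above can be illustrated with a small standalone sketch. It is not part of the commit: it assumes GNU touch (for -d "N hours ago"), and the directory and file names are made up for the demonstration. It shows that "-mtime -N" alone excludes files that are exactly N whole days old, whereas the combined test includes them.

```shell
#!/bin/sh
# Sketch only: assumes GNU touch; file names are hypothetical.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Four non-empty dump files aged 0, 1, 2 and 3 whole days.
for age in 0 1 2 3
do
    echo data > "inpaths.$age"
    touch -d "$((age * 24 + 2)) hours ago" "inpaths.$age"
done

# Old-style test: strictly fewer than 2 whole days old.
OLD=$(find . -name 'inpaths.*' ! -size 0 -mtime -2 -print | sort | xargs)

# New-style test: 2 whole days old or fewer.
NEW=$(find . -name 'inpaths.*' ! -size 0 \( -mtime -2 -o -mtime 2 \) -print | sort | xargs)

# With 0, "+0 or 0" matches every file, which is why KEEP=0 removes all dumps.
ALL=$(find . -name 'inpaths.*' \( -mtime +0 -o -mtime 0 \) -print | sort | xargs)

echo "old: $OLD"
echo "new: $NEW"
echo "all: $ALL"

cd / && rm -rf "$dir"
```

The same reasoning applies to the removal step: the previous "-mtime +$keep" only matched files strictly older than $keep whole days.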
Modified: doc/pod/news.pod
===================================================================
--- doc/pod/news.pod 2011-07-17 19:42:02 UTC (rev 9283)
+++ doc/pod/news.pod 2011-07-18 22:41:23 UTC (rev 9284)
@@ -162,6 +162,15 @@
=item *
+It is now possible to properly generate daily statistics with B<sendinpaths>
+thanks to the new B<-k> and B<-r> flags that permit to control the interval
+of days for processing dump files.
+
+Also fixed an issue with statistics that could be missing for a couple of
+days when monthly sent.
+
+=item *
+
B<cnfsheadconf> now properly recognizes continuation lines in
F<cycbuff.conf>, that is to say lines ending with a backslash (C<\>).
Thanks to John S<F. Morse> for the bug report.
Modified: doc/pod/ninpaths.pod
===================================================================
--- doc/pod/ninpaths.pod 2011-07-17 19:42:02 UTC (rev 9283)
+++ doc/pod/ninpaths.pod 2011-07-18 22:41:23 UTC (rev 9284)
@@ -19,11 +19,11 @@
into the report. The purpose of the final report is to summarize the
frequency of occurrence of sites in the Path: headers of articles.
-Some central sites accumulate the Path: data from many news servers running
-this program or one like it, and then report statistics on the most
-frequently seen news servers in Usenet article Path: lines. The
-B<sendinpaths> shell script can be run once a month to mail the
-accumulated statistics to such a site and remove the old dump files.
+Some central sites accumulate the Path: data from many news servers
+running this program or one like it, and then report statistics on
+the most frequently seen news servers in Usenet article Path: lines.
+The B<sendinpaths> shell script can be run daily to mail the accumulated
+statistics to such a site and remove the old dump files.
You can get a working setup by doing the following:
@@ -33,31 +33,45 @@
Create a directory at I<pathlog>/path (replacing I<pathlog> here and in
all steps that follow with the full path to your INN log directory).
+Do not change the name of the C<path> subdirectory because it is used
+by B<sendinpaths>.
=item 2.
Set up a channel feed using an entry like:
- inpaths!:*:Tc,WP:ninpaths -p -d <pathlog>/path/inpaths.%d
+ inpaths!:*:Tc,WP:<pathbin>/ninpaths -p -d <pathlog>/path/inpaths.%d
if your version of INN supports C<WP> (2.0 and later all do). Replace
+<pathbin> with the full path to your INN binaries directory, and
<pathlog> with the full path to your INN log directory.
=item 3.
-Enter into your news user crontab something like:
+Run the following command to start logging these statistics:
- 6 6 * * * ctlinnd flush inpaths!
+ ctlinnd reload newsfeeds 'inpaths feed setup'
+=item 4.
+
+Enter into your news user crontab these two lines:
+
+ 6 6 * * * <pathbin>/ctlinnd flush inpaths!
+ 10 6 * * * <pathbin>/sendinpaths
+
(the actual time doesn't matter). This will force B<ninpaths> to generate
-a dump file once a day.
+a dump file once a day. Then, a few minutes later, B<sendinpaths> collects
+the dumps, makes a report, sends the collected statistics, and deletes
+the old dumps.
-=item 4.
+Note that you can manually generate a report without mailing it, and
+without deleting processed dump files, with C<sendinpaths -n>.
-Once per month, run the B<sendinpaths> script, which collects the dumps,
-makes a report, and then deletes the old dumps. (You can generate a
-report without mailing it and without deleting it with C<sendinpaths -n>.)
+=item 5.
+In a couple of days, check that your daily statistics properly appear in
+L<http://www.top1000.org/>.
+
=back
=head1 OPTIONS
@@ -88,8 +102,8 @@
=item B<-v> I<level>
-Set the verbosity level of the report. Valid values for I<level> are 0,
-1, and 2, with 2 being the default.
+Set the verbosity level of the report. Valid values for I<level> are C<0>,
+C<1>, and C<2>, with C<2> being the default.
=back
@@ -139,14 +153,14 @@
If your INN doesn't have the C<WP> feed flag (1.5 does not, 1.6 and 1.7 do,
2.0 and later all do), use the following F<newsfeeds> entry:
- inpaths!:*:Tc,WH:ginpaths
+ inpaths!:*:Tc,WH:<pathbin>/ginpaths
where B<ginpaths> is the following script:
#!/bin/sh
- exec egrep '^Path: ' | ninpaths -p -d <pathlog>/path/inpaths.%d
+ exec egrep '^Path: ' | <pathbin>/ninpaths -p -d <pathlog>/path/inpaths.%d
-replacing <pathlog> as above.
+replacing <pathbin> and <pathlog> as above.
=head1 HISTORY
Modified: doc/pod/sendinpaths.pod
===================================================================
--- doc/pod/sendinpaths.pod 2011-07-17 19:42:02 UTC (rev 9283)
+++ doc/pod/sendinpaths.pod 2011-07-18 22:41:23 UTC (rev 9284)
@@ -4,12 +4,13 @@
=head1 SYNOPSIS
-B<sendinpaths> [B<-n> | "I<address> [I<address> ...]"]
+B<sendinpaths> [B<-n>] [B<-k> I<keep-days>] [B<-r> I<report-days>]
+[I<address> [I<address> ...]]
=head1 DESCRIPTION
B<sendinpaths> checks I<pathlog>/path for B<ninpaths> dump files, finds
-dump files generated in the past 30 days, makes sure they are valid
+dump files generated in the past I<report-days> days, makes sure they are valid
by running B<ninpaths> on each one and making sure the exit status is
zero, and passes them to B<ninpaths> to generate a cumulative report.
By default, that report is mailed to the e-mail addresses configured at
@@ -18,7 +19,7 @@
useful statistics: see L<http://www.top1000.org/> for more information.
When finished, B<sendinpaths> deletes all dump files in I<pathlog>/path
-that are older than 14 days (configurable at the beginning of the script).
+that are older than I<keep-days> days.
For more information on how to set up B<ninpaths>, see ninpaths(8).
@@ -26,21 +27,36 @@
=over 4
+=item B<-k> I<keep-days>
+
+After having processed dump files, B<sendinpaths> removes those that are
+older than I<keep-days> days. The default is C<0>, that is to say to
+remove all dump files.
+
+Setting I<keep-days> to another value can be useful for debugging purpose
+because it permits to keep a few dump files.
+
=item B<-n>
Don't e-mail the report; instead, just print it to standard output. Don't
delete old dump files.
+=item B<-r> I<report-days>
+
+Process dump files generated during the last I<report-days> days.
+The default is C<32>, that is to say to process all the dump files that
+have been generated during the last 32 days (if, of course, they have
+not been deleted yet by a previous run of B<sendinpaths> according to the
+value set by the B<-k> flag).
+
=item I<address> ...
E-mail the report to the mentioned addresses instead of the default ones.
Several addresses can be used, separated by whitespace. For instance,
for two adresses:
- sendinpaths "pathsurvey at top1000.org top1000 at anthologeek.net"
+ sendinpaths pathsurvey at top1000.org top1000 at anthologeek.net
-The quotes can be omitted when only an address is specified.
-
=back
=head1 HISTORY
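A note on the integer check used for -k and -r in the script: the pattern *[^0-9]* does not match an empty string, so a missing argument passes the test, and POSIX only specifies [!...] for bracket negation in shell patterns. A possible stricter variant, given here as a sketch and not as what the commit does (the function name is_uint is hypothetical):

```shell
#!/bin/sh
# is_uint: succeed only if $1 is a non-empty string of decimal digits.
# Hypothetical helper, not part of sendinpaths.
is_uint() {
    case "$1" in
        ''|*[!0-9]*)
            return 1
            ;;
        *)
            return 0
            ;;
    esac
}

# Example use, mirroring the error message of the -k branch.
for value in 32 0 "" 3x -1
do
    if is_uint "$value" ; then
        echo "'$value' accepted"
    else
        echo "'$value' rejected: argument must be an integer."
    fi
done
```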