innd logging conventions

Thu Dec 21 06:53:00 UTC 2000

innd(8) records the following convention for innd logging:

       Innd  also  makes  extensive  reports through syslog.  The
       first word of the log message will be the name of the site
       if  the  entry  is  site-specific (such as a ``connected''
       message).  The first word will be ``SERVER'' if  the  mes-
       sage  relates  to  the  server itself, such as when a read
       error occurs.

       If the second word is the four letters  ``cant''  then  an
       error is being reported.  In this case, the next two words
       generally name the system call  or  library  routine  that
       failed,  and  the  object  upon which the action was being
       performed.  The rest of the line may contain other  infor-
       mation.

       In other cases, the second word attempts to summarize what
       change has been made, while the rest  of  the  line  gives
       more  specific  information.  The word ``internal'' gener-
       ally indicates an internal logic error.

What are people's general feelings about these conventions?  Are people
actively taking advantage of them?

The SERVER part is currently implemented in an odd way, one that tends to
encourage people to leave it off by accident: pretty much all the innd log
messages have the form "%s message" with LogName given as the first
parameter.  LogName is a global variable for reasons that I don't really
understand (I may be missing something).
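
To make the idiom concrete, here's a minimal sketch of it.  Only LogName
and the "%s message" shape come from the description above; the function
names, the sample message, and the has_tag check are made up for
illustration and aren't taken from the innd source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for innd's global; the real definition lives in innd. */
const char *LogName = "SERVER";

/* The idiom as described: the tag is passed as the first %s argument. */
int
format_tagged(char *buf, size_t size, const char *call, const char *object)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s cant %s %s", LogName, call, object);
}

/* The easy mistake: the same message with the tag forgotten.  Nothing
   in the compiler or in syslog notices the difference. */
int
format_forgotten(char *buf, size_t size, const char *call, const char *object)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "cant %s %s", call, object);
}

/* Returns 1 if msg starts with the tag followed by a space, else 0. */
int
has_tag(const char *msg)
{
    size_t len = strlen(LogName);

    return strncmp(msg, LogName, len) == 0 && msg[len] == ' ';
}
```

Since the tag is just one more format argument, leaving it out produces a
perfectly valid message that merely breaks the convention.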

One of the difficulties in applying this convention consistently is that
some of the error messages generated by innd come from other portions of
INN, particularly the storage manager but, over time, libinn as well.
Those portions obviously won't be prepending SERVER since other parts of
INN may call the same routines.

One way that I can handle this is to create customized INN error handlers
for syslog reporting that vsnprintf to a buffer in order to prepend SERVER
and then log the result.  That will make the logging a bit slower, but we
shouldn't have too many messages of this type (the frequent messages are
the ones logged with LOG_NOTICE and those don't go through warn/die).  It
will also make the code more complicated, and means pushing everything
through an otherwise unnecessary vsnprintf.
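
A rough sketch of what such a handler might look like, assuming a
1024-byte buffer is enough for these messages.  The names
(vformat_server, format_server, server_syslog) and the handler signature
are hypothetical; the real INN error handler interface may well take
different arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

#define SERVER_PREFIX "SERVER "

/* Write "SERVER " followed by the formatted message into buf.  Returns
   the length the full message would need (as vsnprintf does, so a value
   >= size means truncation), or -1 if buf cannot hold the prefix. */
int
vformat_server(char *buf, size_t size, const char *format, va_list args)
{
    size_t used = strlen(SERVER_PREFIX);
    int status;

    assert(buf != NULL && format != NULL);
    if (size <= used)
        return -1;
    memcpy(buf, SERVER_PREFIX, used);
    status = vsnprintf(buf + used, size - used, format, args);
    if (status < 0)
        return -1;
    return (int) used + status;
}

/* Variadic convenience wrapper around vformat_server. */
int
format_server(char *buf, size_t size, const char *format, ...)
{
    va_list args;
    int status;

    va_start(args, format);
    status = vformat_server(buf, size, format, args);
    va_end(args);
    return status;
}

/* Hypothetical handler: prepend SERVER, then hand the result to syslog.
   Logging with "%s" keeps any % in the message from being reinterpreted. */
void
server_syslog(int priority, const char *format, ...)
{
    char buf[1024];
    va_list args;

    va_start(args, format);
    if (vformat_server(buf, sizeof(buf), format, args) >= 0)
        syslog(priority, "%s", buf);
    va_end(args);
}
```

The extra cost is the copy through buf on every message, which is the
"otherwise unnecessary vsnprintf" mentioned above; overlong messages are
silently truncated to the buffer size.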

Another option is to drop the SERVER bit, which will mean removing the
first paragraph of the man page text quoted above.  innreport can still
tell the difference by looking for the particular pattern of the messages
it knows about, and people will be able to tell the difference by reading
the message.

A third option is to keep doing what we're doing now, which is to add
SERVER to the internal innd messages but not to the messages from other
parts of INN called by innd.  I'm not a big fan of this, but it's
obviously the simplest.

I may be missing other alternatives.

Also, should we keep the "cant" convention?  I'm a little bothered by the
grammatical incorrectness, but I'm leaning towards keeping it on the
grounds that changing things unnecessarily has little point.  Not every
message that should have cant currently does; if we keep it, I'll try to
pick those up as we go.  (It's tempting to use "can't" instead, though; it
really is.)

Finally, we actually also have a convention for logging priorities that
I'm going to try to apply a bit more uniformly: LOG_CRIT for things that
cause INN to throttle or shut down, LOG_ERR for things that may cause
articles to be lost or other similar levels of error, and LOG_WARNING for
other things that are wrong but not as bad.  My intention is to hook die
up to LOG_CRIT, warn up to LOG_ERR, and have code that wants to warn about
something less important just call syslog with LOG_WARNING directly.
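
The mapping above could be sketched roughly like this.  The severity enum
and the sketch_warn and sketch_die names are invented for illustration
(they're not the existing warn and die in INN), and the sketch formats
through a fixed 1024-byte buffer to stay within standard C plus syslog(3).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>

/* Hypothetical severity classes matching the three cases above. */
enum severity {
    SEV_FATAL, /* causes INN to throttle or shut down */
    SEV_ERROR, /* may lose articles, or similarly serious */
    SEV_MINOR  /* wrong, but not as bad */
};

/* Map a severity class to the syslog priority proposed above. */
int
severity_to_priority(enum severity sev)
{
    switch (sev) {
    case SEV_FATAL:
        return LOG_CRIT;
    case SEV_ERROR:
        return LOG_ERR;
    case SEV_MINOR:
        return LOG_WARNING;
    }
    assert(0 && "unknown severity");
    return LOG_WARNING;
}

/* Format the message and log it at the given priority. */
static void
log_at(int priority, const char *format, va_list args)
{
    char buf[1024];

    vsnprintf(buf, sizeof(buf), format, args);
    syslog(priority, "%s", buf);
}

/* Sketch of a warn hooked up to LOG_ERR. */
void
sketch_warn(const char *format, ...)
{
    va_list args;

    va_start(args, format);
    log_at(severity_to_priority(SEV_ERROR), format, args);
    va_end(args);
}

/* Sketch of a die hooked up to LOG_CRIT; logs and then exits. */
void
sketch_die(const char *format, ...)
{
    va_list args;

    va_start(args, format);
    log_at(severity_to_priority(SEV_FATAL), format, args);
    va_end(args);
    exit(1);
}
```

Anything in the third class would skip these wrappers and call
syslog(LOG_WARNING, ...) directly, as described above.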

Does this all sound good to people?  Input?  Suggestions?

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>