BIND 10 #1518: 0 byte rotated log files when changing settings

BIND 10 Development do-not-reply at isc.org
Thu Jan 5 15:30:56 UTC 2012


#1518: 0 byte rotated log files when changing settings
-------------------------------------+-------------------------------------
                   Reporter:  jreed  |                 Owner:  UnAssigned
                       Type:         |                Status:  reviewing
  defect                             |             Milestone:
                   Priority:         |  Sprint-20120110
  critical                           |            Resolution:
                  Component:         |             Sensitive:  0
  logging                            |           Sub-Project:  Core
                   Keywords:         |  Estimated Difficulty:  7 (but see
            Defect Severity:  High   |  comment)
Feature Depending on Ticket:  none   |           Total Hours:  0
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by stephen):

 * owner:  stephen => UnAssigned
 * status:  assigned => reviewing


Comment:

 Firstly, the problem is not one of lost data.  What is happening is that
 when the log settings are changed, a small number of new log files are
 created with a size of zero (or a few hundred bytes).  If the logging is
 configured such that only a small number of files are retained, these
 zero-length log files take up a number of that limit and older files are
 deleted. (Upping the maximum number of versions to something like 20 and
 looking at the i-node numbers before and after changing the log settings
 shows what is happening.)

 The reason why these files are created is not completely clear, but it is
 to do with the fact that there are multiple processes logging to the same
 file.  Currently when logging settings are changed, the logging code
 clears all appenders (the component that writes to the file) and sets up
 new ones. The effect of this is that for each process, all log files it
 has open are closed and then re-opened.

 Ideally each process should reopen the same file as it closed.  However, I
 speculate that when a BIND 10 process attempts to reopen the log file,
 sometimes the operation fails and for some reason a file rollover is
 triggered.  (A file rollover is where old log files are renamed with a
 higher a version number and a new current log file opened.)  Since there
 are several BIND 10 processes, there can be several rollovers, hence the
 empty files.  Looking at the files opened by the BIND 10 processes
 confirms that view - after a logging reconfiguration we can end up with
 different BIND 10 processes pointing to different versions of the log
 file. (Processes do not necessarily all end up pointing to different log
 files - two or three processes may end up pointing to the same file.)
 This observation also explains why some files are zero length and some
 only a few hundred bytes - it depends on whether the process in question
 generated a log message when it received a configuration change: if so,
 the message was logged to the newly-opened log file.

 So we have one problem with a change of logging configuration.  However,
 thinking about it, even if the reconfiguration did not generate redundant
 files, we would have still have a problem when a file rollover takes
 place.  To illustrate this, suppose we have processes Alpha and Beta both
 writing to the file "out.log".  Suppose also that the file has reached its
 maximum size.  When Alpha next writes to the log file, the size of the
 file is checked.   Since it has reached its maximum size, Alpha closes the
 file and performs the rollover:  "out.log" is renamed "out.log.1", a new
 "out.log" is created, and Alpha writes the message to that.  During this
 process, Beta still has the same file open, so after the rename, it is
 "out.log.1" that Beta has open.  If Beta now writes to the log file, the
 same thing happens.  The check on the file size causes the file to be
 closed and the rollover to take place: "out.log.1" is renamed "out.log.2",
 "out.log" is renamed "out.log.1" and Beta opens "out.log"  and writes the
 message to it.  So although we started with both Alpha and Beta logging to
 the same file, we end up with Alpha and Beta logging to different files.
 (I don't think we have seen this yet, but only because tests tend to
 concentrate on a single component at a time.)

 In summary, we have two problems with the rotating log files if all
 processes are set to log to the same file:

 1. When logging is reconfigured, instead of processes closing the current
 log file and reopening it, sometimes the BIND 10 processes can end up
 opening different log files.
 2. Even if (1) were not a problem, we can expect processes to end up
 logging to different log files when the log file size reaches the maximum
 limit.

 '''Suggestions'''[[BR]]
 1. I think we will need to enable log4cplus debugging output to chase down
 the first problem. (It seems to be on by default but output is written to
 stderr, which is not captured by processes started by Boss.)  It may be
 that the problem is just that a number of processes attempt to open the
 same file together and the error causes the rollover.  A small random wait
 may be all that is required.

 2. I think the second problem could be mitigated by using a class derived
 from the log4cplus !RollingFileAppender that overrides some of its
 methods.  When a rollover happens, instead of closing the current file,
 rolling the file names and opening a file, it should close the file, open
 what it thinks should be the current file and try to write again.  Only if
 this file exists and is at maximum size should the rollover be performed.
 We might also want to consider the idea of a lock file to prevent two
 processes trying to rollover a particular log file at the same time. (In
 fact, use of a lock file might well solve the first problem as well.)

 ''I have put this for review in case anyone else has an
 insight/suggestions for this.  Before the next sprint planning meeting I
 will close it and create tickets for the agreed solutions.''

-- 
Ticket URL: <http://bind10.isc.org/ticket/1518#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list