BIND 10 #901: Alter logging interface to use positional formatting for message arguments

Fri May 6 12:31:33 UTC 2011

#901: Alter logging interface to use positional formatting for message arguments
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  stephen
  stephen                            |                Status:  reviewing
                       Type:         |             Milestone:
  enhancement                        |  Sprint-20110517
                   Priority:  major  |            Resolution:
                  Component:         |             Sensitive:  0
  logging                            |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  0.0
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by vorner):

 * owner:  vorner => stephen

Comment:

 Hello

 Replying to [comment:4 stephen]:
 > '''src/lib/log/log_formatter.{cc,h}'''[[BR]]
 > Is there a need to have a copy constructor (or assignment operator) for
 the Formatter object?  arg() could be declared "Formatter&" (returning
 *this) and could increment nextPlaceholder_ when called.  It would avoid
 multiple objects being constructed and destroyed. (It would also allow
 message_ to be stored in the class itself.)

 Well, there's a need when it's returned from the error() and such
 functions (return from C++ function needs a copy constructor) and we can't
 really use a reference there. For one, we have no place to allocate it so
 it survives the end of function, for second, we need the destructor to be
 called after the last arg().

 It's true I could probably use the reference when returning from arg, but
 I needed the constructor anyway. And I wasn't sure if it's OK regarding
 the lifespan of temporary objects ‒ if I call `a().b().c();`, is it
 guaranteed for the result of a to survive until the end of c(), or is it
 allowed for a to be deleted after b() finishes? So I just used this way to
 be safe. This should not produce additional overhead, the data types used
 there are simple, it is trivial to prove there are no sideeffects, so
 usual data-flow analysis optimisation will just throw everything out and
 produce equivalent of:

 {{{#!C++
 if (formatter.active_) {
   replaceArg(formatter.message_, arg1, 1);
   replaceArg(formatter.message_, arg2, 2);
   logger.output(formatter.prefix_, formatter.message_);
 }
 }}}

 > replacePlaceholder(): I don't think that it should add output if a
 placeholder is not found - there may be a valid reason for a missing
 placeholder, especially if the message is translated into multiple
 languages.  A rather contrived example (the first that came into my head)
 is "%1 sheep" in English and "%1 mouton%2" in French, called with
 > {{{
 > .arg(n).arg((n == 1) ? "" : "s")
 > }}}

 I believe this is quite rare (anyway, this exact example will not work
 anyway, because you want to translate to Czech then and you have „ovce“ in
 singular and „ovcí“ in plural, then you want to translate to Chinese,
 which won't use 's' for sure). On the other hand, I think it's quite
 common for someone to make a typo, forget the placeholder, place '%s'
 instead of '%1', etc. This way, not only we know that there's problem with
 the message. We also know what is the value of parameter, which might be
 valuable, even if the aestetic effect of our log files suffer little bit
 by it.

 > replacePlaceholder(): it may be more flexible to replace all instances
 of a placeholder in a string rather than the first one.  (To avoid
 recursion, the second and subsequent searches for placeholders would need
 to start beyond the replaced text.)

 OK.

 > replacePlaceholder(): is there any reason why this is not a private
 method of the class?

 Yes, actually, there are two:
  * The class is a template, therefore even the function would be
 templated, which is not needed at all.
  * The function might be useful by itself. In fact, it's called in
 formatting in the message compiler, when an exception is caught.

 > '''src/lib/log/macros.h'''[[BR]]
 > In some cases in C/C++, the language requires a statement but the logic
 of the program does not.  In this case a null statement - the semi-colon
 by itself - is used, e.g.
 > {{{
 > for (len = 0; str[len]; ++len)
 >    ;
 > }}}
 > Braces are used to surround a block of statements, but I've not managed
 to find anything that explicitly states that it is valid for them to
 surround an empty block.  Would it be safer for the braces to include a
 single semi-colon?

 Well, for one, it's implied. A block is sequence of statements and a
 sequence can, of course, be of zero length.

 And it's really widely used. Not so often in if statement, but we have
 bunch of empty functions and empty constructors/destructors. So I wouldn't
 be afraid. But if you want it there, I have no problem in including one
 there, only the visual effect would be affected.

 > It might be useful to include macros.h in logger.h; that way the macros
 are available whenever the logger is available.

 I don't like it. As I said on jabber, I'd like to be able to use the
 logger without defining any macros. Macros may bring define collisions, or
 do nasty things with code (if there's a header that has a function called
 `LOG_DEBUG` and actually uses it in one of its own inline functions, the
 header would be really unhappy about this). So I prefer not including it,
 if it is safe to include and use these macros, one can do it manually, if
 it's problematic, then at last the core logger can be included.

 But what about doing it the other way around, include logger in macros?

 > '''src/lib/log/tests/log_formatter_unittest.cc'''[[BR]]
 > In the "active" test, the ASSERT_LE test seems superfluous since the
 next test is EXPECT_EQ on the same variable.

 Well, if there's no output, then the other tests would crash, so we can't
 continue. But if there are more outputs, we want to continue and check the
 value of the output, but still report there are too many. Maybe this is
 little bit overcomplicated. Should I just replace it with ASSERT_EQ and
 not care about the output in such situation?

 > Should add a test to ensure that substituting "%1" with "%1" does not
 cause an endless loop.

 Yes, I added it when doing the replacement of multiple placeholders. I
 didn't get the idea this could happen, when I was thinking about single
 replacement.

 > '''!ChangeLog entry'''[[BR]]
 > The logging code has been provided in a previous release, so I think
 there should be a !ChangeLog entry for this.

 So, let's say this?

 {{{
 [func]*
 The logging interface changed slightly. We use
 logger.foo(MESSAGE_ID).arg(bar);
 instead of logger.foo(MESSAGE_ID, bar); internally. The message
 definitions use
 '%1,%2,...' instead of '%s,%d', which allows us to cope better with
 mismatched
 placeholders and allows reordering of them in case of translation.
 }}}

-- 
Ticket URL: <http://bind10.isc.org/ticket/901#comment:5>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development