BIND 10 #758: Conversion of bind10 program to use new logging interface

Fri Jul 1 11:29:09 UTC 2011

#758: Conversion of bind10 program to use new logging interface
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  stephen                            |                Status:  reviewing
                       Type:         |             Milestone:
  enhancement                        |  Sprint-20110712
                   Priority:  major  |            Resolution:
                  Component:         |             Sensitive:  0
  logging                            |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  2.0
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by jelte):

 * owner:  jelte => jinmei

Comment:

 Replying to [comment:6 jinmei]:
 > '''general'''
 >
 > - I made some trivial changes.  but like the previous case I may have
 >   made a mistake, so please check them
 >

 yes they look good, thanks

 > - I've seen some inconsistency about "msgq" vs "message bus" vs
 >   "message queue" with or without "daemon", etc.  Log IDs generally
 >   have "MSGQ".  Maybe we should leave them as is and note this in a
 >   msgq replacement ticket (if any, or if it doesn't exit create a
 >   separate ticket for cleanup).  Of course we could provide
 >   consistency within this ticket.
 >

 Right, I was aware of this while I was writing those, and the problem is
 that some are highly specific for msgq, i.e. our implementation of it.
 Depending on what we end up with some of these messages may need to be
 changed or removed. So I think we should indeed leave them and make a note
 in the msgq replacement ticket. Will do so once i'm done with this.

 > '''specifics'''
 >
 > - shouldn't this better be "bind10", "boss", or "bob" etc?  At least
 >   "b10-bind10" sounds awkward (and in fact it's different from the
 >   program name):
 > {{{
 > +isc.log.init("b10-bind10")
 > }}}

 The current name is an effort to stay consistent with the rest of the
 daemons... I had considered multiple alternatives;
 - 'b10-boss', with the current convention of doing the actual logging as a
 second-level logger with the name of the daemon, it would become
 'b10-boss.bind10' in output, which seems even weirder.
 - just 'b10' (making the actual output 'b10.bind10') but then to configure
 'full' logging for boss module (and all libs it might use) it would be
 'b10'
 - 'b10-bind10' seemed to have the least drawbacks in the end (being the
 most predictable imo)

 But come to think of it, when looking from the administrators point of
 view, there may be a deeper problem: i think bind10[.py] has the wrong
 name.
 So i'm thinking that we should actually change that too, and make 'bind10'
 a script that calls b10-boss, so this will be 'just another' module in the
 process list where all the others already start with b10-. This may be
 related to an effort of getting 'internal' and 'external' names of
 modules, processes and config consistent, but in the case of bind10.py
 there is even more of a disconnect than for the others right now.

 Changed the root logger name to 'b10-boss' and the actual logger to 'boss'
 (being closest to the current command/configuration name of the module).

 >
 > - BIND10_START_AS_NON_ROOT: in my weak opinion "info" (or "notice")
 >   seems to be better than warn in this case.  I would generally
 >   consider "warn" an error (but not fatal one, e.g., when a given
 >   config parameter is too small/large, but there's a reasonable
 >   default to which we can fall back and we use it)
 > - BIND10_START_AS_NON_ROOT: If we have a ticket for the socket
 >   creator, I'd add a note to it about this description:

 ok, there's two tickets, neither of which is 'the ticket' for inclusion
 (#800 and #801), but i added them anyway.

 >
 > - BIND10_KILL_PROCESS: I'd explain the meaning of parameter (%1)
 >

 made it 'kill the process with the given name'

 > - BIND10_STARTING_PROCESS_PORT: maybe we should change '%2:%3' to
 >   '%2#%3' as we discussed on jabber (and I saw a ticket about this)
 >

 Is that a common notation? I don't think i've ever seen it used (well,
 apart from bind9, apparently). I realize addr:port is ambiguous if it's
 ipv6 (where [addr]:port is usually used), but # seems quite unusual to me.
 Changed it to # anyway, in a step towards consistency.

 > - maybe you should adjust the method document for log_starting because
 >   it's now not configurable only by the verbose option.
 >

 sure, done

 > - in my test run (with -v, from source), I couldn't find logs in
 >   log_starting().  am I missing something?  hmm, perhaps logs before
 >   creating ccsession aren't dumped?  FYI: this is what I've got
 >   (removing timestamps):
 > {{{
 > 2011-06-30 18:18:06.566 INFO  [b10-bind10.bind10] BIND10_STARTING
 starting BIND10: bind10 20110223 (BIND 10 20110519)
 > 2011-06-30 18:18:07.787 INFO  [b10-bind10.bind10]
 BIND10_READING_BOSS_CONFIGURATION reading boss configuration
 > 2011-06-30 18:18:07.787 INFO  [b10-bind10.bind10]
 BIND10_CONFIGURATION_START_AUTH start authoritative server: True
 > 2011-06-30 18:18:07.787 INFO  [b10-bind10.bind10]
 BIND10_CONFIGURATION_START_RESOLVER start resolver: False
 > 2011-06-30 18:18:07.951 INFO  [b10-bind10.bind10]
 BIND10_STARTUP_COMPLETE BIND 10 started
 > 2011-06-30 18:18:09.384 INFO  [b10-bind10.bind10] BIND10_RECEIVED_SIGNAL
 received signal SIGINT
 > 2011-06-30 18:18:09.385 INFO  [b10-bind10.bind10] BIND10_SHUTDOWN
 stopping the server
 > 2011-06-30 18:18:10.386 INFO  [b10-bind10.bind10] BIND10_SEND_SIGTERM
 sending SIGTERM to b10-msgq (PID 71408)
 > 2011-06-30 18:18:10.486 INFO  [b10-bind10.bind10]
 BIND10_SHUTDOWN_COMPLETE all processes ended, shutdown complete
 > }}}

 no i originally set them to DEBUG, since i didn't like the 3 terminal
 screens of STARTING/STARTED messages. So they are not printed by default
 (since default is INFO). Changed STARTING message to severity INFO, and i
 left STARTED at DEBUG, to make it the output a bit less. But perhaps we
 should remove the STARTED log completely, and only log something if
 starting failed.

 >
 > - BIND10_MSGQ_ALREADY_RUNNING: "appears to me" should be "appears to
 be"?
 >

 ack

 > - BIND10_SHUTDOWN: in "It will send shutdown commands to each
 >   process.", "shutdown commands" should be "a shutdown command" if you
 >   say "each process"?
 >

 ok

 > - BIND10_SEND_SIGTERM: I'd use "notice" (if we have it) because it's
 >   not an unexpected case even if not considered an error.  also, you
 >   may want to refer to BIND10_SHUTDOWN to explain when this happens.
 >   Same for BIND10_SEND_SIGKILL.
 >

 there's no notice, so info seems closest (which it already is). There's
 only one case where this isn't unexpected, and that is for msgq, btw.

 > - BIND10_PROCESS_ENDED_NO_EXIT_STATUS: I'd use a stronger level than
 >   "info" (see the previous point).  Same for other logs in this method.
 >
 > - BIND10_PROCESS_ENDED_WITH_EXIT_STATUS: as I said on jabber in a
 >   different context I personally don't like to repeat the same content
 >   in multiple places (most of it is a duplicate of NO_EXIT_STATUS),
 >   especially for text this long.  But understanding this may be a
 >   matter of opinion, I'd leave it to you.
 >

 I actually disagree with the general premise, but in this case a reference
 will do (due to the way these messages are tied together in the way that
 they are conditionally called)

 > - BIND10_INVALID_USER: very minor point, but the use of '' seems to be
 >   inconsistent.  Also, 'invalid user' seems to be too simple.  Maybe
 >   something like 'changing user to %1 failed'?
 >

 'failed' seems too general to me, imo it should be either invalid user or
 unknown user

 > - BIND10_SELECT_ERROR: is the select really used "to see if a child
 >   process has ended"? (just checking)
 >

 yes, it checks both the socket filehandle and a pipe where the signals get
 written to

 > - BIND10_MSGQ_DISAPPEARED: does bind10 really shut down in this case?
 >   From a quick look, it seems to just exit from the for loop and
 >   continue the while loop.  Would someone set 'runnable' to set false
 >   once this happens?
 >

 see below

 > - Is it okay to simply remove this?
 > {{{
 > -    sys.stdout.write("[bind10] BIND 10 exiting\n");
 > }}}

 I thought so, it is logger right after 'shutdown()' completes, which also
 prints a 'done shutting done' message.

 >
 > Oh, I missed your follow up comment (actually I read it but then forgot
 > it while I wrote my own review:-).  But it still doesn't seem to set
 > 'runnable'?  Perhaps you forgot to push your local change?
 >

 oops, i did add it, but apparently forgot to even commit it

 > BTW: if we haven't, I think we should have a test case for that
 > (which should go to a separate ticket though).
 >

 we don't, it would also require a bit of refactoring of bind10.py to get
 to that specific piece of code from a test. Created ticket #1088 for it.

-- 
Ticket URL: <http://bind10.isc.org/ticket/758#comment:9>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development