[bind10-dev] Questions on the Logging API

Fri Dec 17 12:05:52 UTC 2010

On 17 Dec 2010, at 09:43, Michal 'vorner' Vaner wrote:

Hello

> Reading the wiki page, I have few notes. First, what I think we want:
> • We want to be able to grep the source code for the message or code or
>  something (eg, what part of code did produce this message).

Each message should include something that does not change, even if the text of the message does (this will allow support to trace any problems).  So messages would be of the form:

   MULQNAME: a question section may have only one qname, packet received that has 6

The identifier of the message is also the symbol name, so the relevant statement that generates it would be something like:

log.error(DLB_MULQNAME, nquestion);

... and a grep for MULQNAME will identify the section of code.  As the idea of symbolic names for the messages might be seen as too much like VMS or Windows, we could go for numeric identifiers, e.g.

   DLB1023:  a question section may have only one qname, packet received that has 6

However, as we wouldn't want to use numeric values in the code (DLB1023 is a lot less informative than DLB_MULQNAME), we would need some intermediate stage - grep for DLB1023 in the message files to get MULQNAME, then grep for that.
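As a rough illustration of the symbolic form, here is a minimal sketch in which the symbol's value is the name written to the log. The names MessageID, messageDictionary and formatMessage, and the "%1" placeholder syntax, are assumptions made for this example and are not taken from the wiki proposal.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical identifier type: the stable name that prefixes each logged line.
typedef std::string MessageID;

// The symbol carries the facility prefix (DLB); its value is the name written
// to the log, so a grep for MULQNAME finds both the log line and this code.
const MessageID DLB_MULQNAME = "MULQNAME";

// Hypothetical in-memory dictionary. In the proposal the text would come from
// a message file; the "%1" placeholder syntax is an assumption.
inline const std::map<MessageID, std::string>& messageDictionary() {
    static const std::map<MessageID, std::string> dictionary = {
        {"MULQNAME", "a question section may have only one qname, "
                     "packet received that has %1"}
    };
    return dictionary;
}

// Build the log line: the identifier, then the text with %1 replaced by the
// argument. An unknown identifier is still logged, followed by the raw value.
template <typename T>
std::string formatMessage(const MessageID& id, const T& arg) {
    std::ostringstream value;
    value << arg;

    std::string text = "%1";
    const std::map<MessageID, std::string>::const_iterator found =
        messageDictionary().find(id);
    if (found != messageDictionary().end()) {
        text = found->second;
    }
    const std::string::size_type pos = text.find("%1");
    if (pos != std::string::npos) {
        text.replace(pos, 2, value.str());
    }
    return id + ": " + text;
}
```

With this sketch, formatMessage(DLB_MULQNAME, nquestion) produces a line that starts with MULQNAME, whatever the message text is changed to.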

> • We might want to be able to run without the message file found, eg have a
>  fallback compiled into the binary or have the identifiers be the message
>  texts (something GNU GetText does AFAIK).

The idea is that the messages are replaceable during the build process.  If a site does not supply a replacement message file, the original file supplied with the distribution is used.  And even if a replacement message file is supplied, it is merged with the distribution file rather than replacing it.  In this way, if we were to upgrade BIND and add new messages, BIND would still build with a message replacement file for the previous version, even if it did generate warnings about some messages not being replaced.
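The merge step could look something like the sketch below. It assumes both files have already been parsed into maps from identifier to text; MessageMap and mergeMessages are hypothetical names, and silently ignoring replacement entries whose identifier is not in the distribution file is an assumption, since that case is not covered above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical parsed form of a message file: identifier -> message text.
typedef std::map<std::string, std::string> MessageMap;

// Start from the distribution messages and overlay any site replacements.
// Every distribution message without a replacement keeps its original text
// and produces a warning, so a replacement file written for an older version
// still yields a complete set of messages.
MessageMap mergeMessages(const MessageMap& distribution,
                         const MessageMap& replacement,
                         std::vector<std::string>& warnings) {
    MessageMap merged(distribution);
    for (MessageMap::const_iterator entry = distribution.begin();
         entry != distribution.end(); ++entry) {
        const MessageMap::const_iterator site = replacement.find(entry->first);
        if (site != replacement.end()) {
            merged[entry->first] = site->second;
        } else {
            warnings.push_back("message " + entry->first +
                               " not replaced, using distribution text");
        }
    }
    // Assumption: replacement entries with unknown identifiers are ignored.
    return merged;
}
```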

> • Writers of extensions (if it is data source plugin, some filter, other module,
>  whatever) should be able to do so without recompilation of the rest and their
>  extension should be installable by simply unpacking the package.

We could do that, using dynamically loaded libraries. But we may want to look at it in other contexts (e.g. allowing a call-out to a user-written function to inspect a query received by the nameserver before passing it through for processing).

> • We might want to transfer the logging events into a single module/process or
>  over the network. We might want to use the identifiers only at that point.

If you are sending logging information, you need (a) something that describes what is being logged and (b) any parameters.  When that gets translated into a form that can be interpreted is up to the application.  Some systems do a late binding of the message with text - the data is stored in binary format and is only converted into readable form at the time it is inspected (e.g. Windows event logger).  That requires the messages be in a DLL that the program can bind with.

But you would still need an identifier with the message, even if it were only a number.

> • We might want to produce a translatable log as well as the usual one. So only
>  the identifiers and values would be put there in machine parsable format (for
>  example JSON) and it could be transformed by the message files to produce log
>  in any language.

OK, I see.  You are saying that you might want:

   MULQNAME: a question section may have only one qname, packet received that has 6

... written to the log file but something like:

  { "id": "MULQNAME", "args": [6] }

... written to a network port.

Yes, that is do-able...

Perhaps the most difficult part about it is packaging up the various types of arguments.  I would think that the easiest thing is to convert them all to strings at the point at which the message object (containing the condition code and arguments) is created.
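A minimal sketch of that idea, assuming a hypothetical Message class and that every argument type can be written to a std::ostream:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical message object: a condition identifier plus its arguments,
// each converted to a string as soon as it is added.
class Message {
public:
    explicit Message(const std::string& id) : id_(id) {}

    // Accept any type that supports operator<< and store its text form.
    template <typename T>
    Message& arg(const T& value) {
        std::ostringstream text;
        text << value;
        args_.push_back(text.str());
        return *this;
    }

    const std::string& id() const { return id_; }
    const std::vector<std::string>& args() const { return args_; }

private:
    std::string id_;
    std::vector<std::string> args_;
};
```

Something like Message("MULQNAME").arg(nquestion) then carries only strings, whatever the original argument types were, at the cost of losing the type information on the receiving side.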

> We should be able to have some kind of minimal machine
> parsable output (for transfer over the network and to store for the translatable
> log, which might have even binary format for speed/size).

Converting arguments to a string would allow for a simpler binary format stream, e.g.

<message size><condition code><arg count><string1><\0><string2><\0>...
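One possible encoding of that layout is sketched below. The field widths (32-bit size, 32-bit condition code, 16-bit argument count), the big-endian byte order, and the choice that the size counts the bytes following the size field are all assumptions; none of them is fixed by the layout above. encodeMessage and appendBigEndian are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Append the low 'bytes' bytes of 'value' to 'out', most significant first.
void appendBigEndian(std::string& out, std::uint32_t value, int bytes) {
    for (int shift = (bytes - 1) * 8; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((value >> shift) & 0xFF));
    }
}

// Encode <message size><condition code><arg count><string1><\0>...
// Assumes arguments contain no embedded NUL and number at most 65535.
std::string encodeMessage(std::uint32_t code,
                          const std::vector<std::string>& args) {
    std::string body;
    appendBigEndian(body, code, 4);
    appendBigEndian(body, static_cast<std::uint32_t>(args.size()), 2);
    for (std::vector<std::string>::const_iterator a = args.begin();
         a != args.end(); ++a) {
        body += *a;
        body.push_back('\0');  // terminator after every argument
    }

    std::string wire;
    // Assumption: the size field counts the bytes that follow it.
    appendBigEndian(wire, static_cast<std::uint32_t>(body.size()), 4);
    wire += body;
    return wire;
}
```

Because every argument is already a string, the receiver needs no per-message knowledge of argument types to split the stream back into its parts.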

We would need to be disciplined though.  For the information to be interpreted by later versions of BIND-10, message numbers and the meaning of arguments must not change.  So new messages should only be added to the end of the message file, and messages that become unused should never be removed.

> I'm not sure if we need to write the message compiler ourself. There must be
> some already out there.

I've not found one.  But the one I'm proposing is fairly trivial - just converting a message file into a C++ header file.  And it would be one less package to download in order to install BIND-10.
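To give an idea of how small such a compiler could be, here is a sketch of its core. The message file format (one "NAME text" entry per line, '#' for comments), the generateHeader name, and the form of the generated constants are all assumptions made for illustration; a real version would also need to carry the message text and add include guards.

```cpp
#include <sstream>
#include <string>

// Turn the contents of a message file into the text of a C++ header.
// Assumed (hypothetical) file format: one message per line, "NAME text...";
// blank lines and lines starting with '#' are skipped.
std::string generateHeader(const std::string& facility,
                           const std::string& messageFile) {
    std::istringstream in(messageFile);
    std::ostringstream out;
    out << "// Generated from the message file, do not edit.\n";

    std::string line;
    unsigned int number = 0;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name;
        if (!(fields >> name) || name[0] == '#') {
            continue;
        }
        // Numbers follow file order, so appending a message never renumbers
        // the existing ones.
        ++number;
        out << "const unsigned int " << facility << "_" << name
            << " = " << number << ";\n";
    }
    return out.str();
}
```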

> And it might be possible not to have explicit message files, because it is
> tiresome to go between code and the message file, it would create more
> conflicts, etc. I think it would be easier if the compiler could pick the
> messages out of the files by itself.

It's not particularly tiresome.  When creating/editing a package you create/edit the message file at the same time.  And the message file -> header file step is hidden in the build procedure.  The chances of conflicts are negligible - messages in different files have different facilities and hence different symbol names.

I'm not sure what you mean when you say "the compiler could pick the messages out of the files by itself"?  It needs to know what to pick - which leads us back to the idea that each message has some unique identification.

> And I think if we go with the numbers (which seems to make sense if we want to
> do the sending of data through network some time in future), we need to have two
> numbers there ‒ the facility/module and the message itself (otherwise two
> extensions oblivious of each other will clash, this way it would be possible for
> them to at least synchronise the facility numbers and have whatever messages they
> want).

The wiki article suggested that, putting the facility code into the most significant 16 bits and the error code into the least significant 16 bits.  So a single 32-bit number holds both pieces of information. (In fact you can put a lot of information into a 32-bit number - see for example the information encoded into a VMS condition code - http://h71000.www7.hp.com/doc/82final/5973/5973pro_022.html)
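The packing itself is only a shift and a mask. A minimal sketch, with makeCode, facilityOf and messageOf as hypothetical helper names:

```cpp
#include <cstdint>

// Combine a 16-bit facility code (high half) and a 16-bit message code
// (low half) into one 32-bit condition code.
inline std::uint32_t makeCode(std::uint16_t facility, std::uint16_t message) {
    return (static_cast<std::uint32_t>(facility) << 16) | message;
}

// Recover the facility code from the most significant 16 bits.
inline std::uint16_t facilityOf(std::uint32_t code) {
    return static_cast<std::uint16_t>(code >> 16);
}

// Recover the message code from the least significant 16 bits.
inline std::uint16_t messageOf(std::uint32_t code) {
    return static_cast<std::uint16_t>(code & 0xFFFF);
}
```

Two extensions then only have to agree on distinct facility numbers; each is free to number its own messages from 0 to 65535.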

Stephen