BIND 10 #2178: Logging for Out-of-zone data

BIND 10 Development do-not-reply at isc.org
Fri Aug 24 13:35:52 UTC 2012


#2178: Logging for Out-of-zone data
-------------------------------------+-------------------------------------
                   Reporter:  jreed  |                 Owner:  jinmei
                       Type:         |                Status:  reviewing
  defect                             |             Milestone:
                   Priority:         |  Sprint-20120904
  medium                             |            Resolution:
                  Component:  data   |             Sensitive:  0
  source                             |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  3
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by jelte):

 * owner:  jelte => jinmei


Comment:

 Replying to [comment:5 jinmei]:
 > In short, I'm okay with merging the branch.  I've fixed some small
 > things, and there are some other small open points, which I'd leave to
 > you.  But larger discussion points occurred to me while reviewing the
 > code (not only for the changes in this branch).
 >
 > Things already addressed:
 > - client_list_unittest.cc didn't compile for me because the result
 >   class doesn't have operator<< (Does it work for a newer version of
 >   gtest?)  I've fixed it myself and committed it.  And, while doing
 >   so, I've renamed negativeResult_ to negative_result_ so it confirms
 >   to the variable naming convention.

 interesting, it certainly compiled here (but i had been wondering why the
 EQ(==) had been used in other places, since operator== obviously is
 defined); fyi, I have gtest 1.6.0

 > - I did some other minor editorial fixes.
 >

 ok, they look fine.


 > Possible minor issues:
 > - `MasterLoadError` can only happen from here:
 > {{{#!cpp
 >                         if (type == "MasterFiles") {
 >
 finder->load(paramConf->get(*it)->stringValue());
 > }}}
 >   so we can narrow the try block.

 Yes, originally i had narrowed it, but then you do get an empty added zone
 (since the 'zonefinder' object is still added. Which would be solved by
 either checking that it succeeded, or doing cache->addZone() it in both
 branches. I did the latter for now, and narrowed the try.

 Passing this ticket back to you to check this change; discussion below is
 needed, but indeed out of scope for this ticket (except perhaps this first
 one about the unified form).

 > - I'd log the zone name and class in the unified form here:
 > {{{#!cpp
 >                     } catch (const isc::dns::MasterLoadError& mle) {
 >                         LOG_ERROR(logger, DATASRC_MASTERLOAD_ERROR)
 >                             .arg(mle.what());
 > }}}
 >

 I don't understand; the exception has zonename/class in it, or do you mean
 it should not be in the exception but in the log arguments?

 > About Changelog:
 > I think this is primarily a fix to lib(b10-)datasrc, and the change in
 > b10-auth is an effect of it that is just most visible (and maybe the
 > only at this moment) to the user.  So, I'd say something like this:
 >
 > {{{
 > libdatasrc: the data source client list class now ignores zone content
 > problems (such as out-of-zone data) in MasterFiles type zones, instead
 > of aborting the entire configuration.  It only logs an error, and all
 > other zones and datasources are still loaded. The error log message
 > has been improved to include the zone origin and source file name.  As
 > a result of this change, b10-auth no longer exits upon encountering
 > such errors.
 > }}}
 >

 ok

 > Bigger issues:
 > - The current error handling in ConfigurableClientList::configure
 >   seems quite ad hoc and policyless.  It's not clear in which case we
 >   (intend to) abort the configuration, accept partial error in a
 >   single data source, accept total error in a single data source, etc.
 >   What I really would like to see is a general policy about these,
 >   introduce higher level exceptions to represent the concept, and let
 >   the underlying data source (client) implementation throw them
 >   according to the severity of specific errors.  I think it's also
 >   related to your second "addition" thing.

 Probably, but I do hope we can get something easier to edit than
 command/config in auth :)

 > - Related to the previous point, catching `MasterLoadError` seems
 >   fragile, because it's not even clear in client_list that the data
 >   source uses masterLoad() for loading the zone.

 Indeed, however, in this specific case I didn't think too much of it
 because 'masterfiles' is already a special case in this implementation,
 with its own handling.

 > - Also related, I guess it's probably too strict to refuse loading
 >   completely when we see an out-of-zone name.  For example, BIND 9's
 >   zone parser warns about it and continue parsing.  We should probably
 >   do the same.
 >

 in that case we'll have to differentiate either with separate exceptions;
 or come up with a different way to propagate problems.

 out-of-zone data is the one that triggered this ticket, but i think most
 of the other ones are more severe (though none should have the old
 behaviour of completely crapping out on startup).

 > But I don't request addressing these in this ticket.  Some of them are
 > clearly beyond the scope of this ticket, and some may be too big
 > anyway, and we'll substantially change the loader and parser pretty
 > soon, so making big changes in this area immaturely wouldn't be a good
 > idea.  It would still be a good idea to make some comments about these
 > points so we can remember it later.
 >

 ack.

 > Replying to [comment:3 jelte]:
 >
 > > - the config update hack in auth's main.cc (line 205) appears to be
 unnecessary; an update is immediately called anyway.
 >
 > Is this ("unnecessary") true?  My understanding is that even if the
 > update callback is called it doesn't provide the default
 > configuration.  Has it been fixed, or is it different for remote
 > configuration?
 >

 ah, indeed, we still need to fix that.

 > > - a potentially bigger problem that is shown by this ticket is that
 reconfigure() in the DataSourceConfigurator can throw, which it shouldn't,
 as it is used as a remote config callback; we should probably catch and
 log std::exception and maybe even (...) as well (in
 reconfigureInternal()).
 >
 > See above.  I agree we should revisit the error handling policy (and
 > implementation).  But as for "it is used as a remote config callback",
 > I guess it's part of another big problem of what we should do if
 > remote config update fails in general.  Also, I think std or even
 > unexpected type of exceptions should actually kill (maybe gracefully)
 > the process just like in other cases.  We generally consider these
 > an unrecoverable error, and rather than trying to deal with it, just
 > let the process die and restart.

 well, there's two potential problems (which is why the set-remote-config-
 handler-call documentation specifies the handler should not throw);
 - the handler cannot be trusted to immediately report errors back to the
 user since the module may not be running, and even if we had 'offline'
 config, we still wouldn't know which modules might use it
 - more importantly, especially in this scenario, is that if the module is
 of kind 'needed', and there is an error in the config which kills the
 module, you can't start bind10 again and fix it; it will immediately exit
 during initialization.

 So there may be classes of errors that should do so; i certainly don't
 think zone-loading is one of them :)

-- 
Ticket URL: <http://bind10.isc.org/ticket/2178#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list