BIND 10 #347: Implement query counters in auth module

Tue Dec 21 00:39:34 UTC 2010

#347: Implement query counters in auth module
------------------------------+---------------------------------------------
      Reporter:  naokikambe   |        Owner:  y-aharen             
          Type:  enhancement  |       Status:  reviewing            
      Priority:  major        |    Milestone:  y2 12 month milestone
     Component:  b10-auth     |   Resolution:                       
      Keywords:               |    Sensitive:  0                    
Estimatedhours:  0.0          |        Hours:  0                    
      Billable:  1            |   Totalhours:  0                    
      Internal:  0            |  
------------------------------+---------------------------------------------
Changes (by jinmei):

  * owner:  jinmei => y-aharen

Comment:

 Replying to [comment:10 y-aharen]:

 Overall, the revised branch looks much better.  I have some non
 trivial comments though, which are below.

 '''general comment'''
  - I made some trivial suggested changes (r3904).  Please check them
    out (and override any of them if necessary)
  - While I personally prefer "statistics" over "stats", consistency is
    even more important.  Right now there is a mixture of "xxxStatsyyy"
    and "xxxStatisticsyyy" (e.g. setStatsSession() vs
    submitStatistics()), which is not good.  If we name something
    "xxxStatistics", everything else should be named with "Statistics"
    (not stats); if we name it "xxxStats", others should also be named
    with "Stats".  The choice is ultimately yours, but it should be
    consistent.  I've fixed one obvious issue with this regard directly
    in the branch: header guard definition (__STAT(ISTIC)S_H).

 '''statistics.h'''
  - now there's no need to include boost/function.hpp.  I've removed
    it.
  - as for the class name:
 > "Counter" class holds a set of query counters. Thus, I renamed it to
 > "QueryCounters". Is it OK? Currently it holds only "UDP" and "TCP"
 > type, but we may extend other types like "Notify", "A/Ixfr" type and
 > so on in the future. I described it as a comment in the code.
 >
 My concern is that with the intent of extension we'll soon encounter
 the need for renaming the class.  Note that notify/ixfr/update are not
 even "queries" (to be very accurate about terminology); we may want to
 have other statistics counters that may not be related to DNS protocol
 handling (e.g, we may want to have a counter about the number of
 reloads).  I don't know the future plans about these possibilities,
 but unless we want to define separate classes for these different
 types of statistics parameters, I'd choose a more generic name such as
 AuthCounters (but note also that as I pointed out in my previous
 comment it may not even be specific to the authoritative server; we'll
 naturally consider having the same framework for b10-recurse, for
 example.  So, eventually, we may want to have a separate library
 module src/lib/statistics or something like that, where we define a
 most generic "Counters" class that can be used for any kinds of
 counter based statistics).

 But at the same time, there's a risk of premature overabstraction, so
 it's a tradeoff issue.  The choice is basically up to you, but if I
 were the principal author, I'd begin with "AuthCounters" with a note
 that we may eventually want to make it a separate library with a
 higher abstraction level (with the cost of renaming classes and its
 overhead).

 '''statistics.cc'''

  - getCounters() (effectively) exposes an internal implementation
    details (i.e., that the counters are stored in std::vector), which
    is not a good practice.  I'd prefer passing a counter type and
    retrieving that specific counter.  BTW: why "shouldn't this be
    called except from tests" as documented in the header file?  It may
    not be of much use in general, but I don't see the reason for
    prohibiting it.  And, in any case, restricting usage by comment is
    not a reliable way to do so.
  - QueryCountersImpl::inc(): is there any reason for converting
    std::out_of_range to InvalidParameter?  On one hand that's
    generally a good practice, but we basically allow standard
    exceptions to be propagated through our public interfaces (unlike
    boost exceptions, which we generally try to hide), so we don't have
    to do anything special here.  If there's a good reason to catch
    (and convert) the exception internally, we should catch it by
    reference, not as an object.

 > > '''auth_srv.cc'''

  - AuthSrv::processMessage
 > >  - AuthSrv::processMessage(): in which case we can see getProtocol()
 > >    be neither UDP nor TCP?  In general, I don't like to handle an
 > >    impossible case this way.  I'd either throw an exception or even
 > >    assert() here.
 > I agree with you. I modified to throw an exception.
 >
 Okay, but the exception case isn't tested.  Please also add
 documentation about the exception in auth_srv.h (to make it better
 gradually).

  - AuthSrv::processMessage another point: this method handles all
    incoming requests including notify and IXFR, not just normal
    queries.  So, technically, incrementing a "query" counter here is
    not correct.  See also the discussion about class naming above.

  - AuthSrv::submitStatistics: this should be a const member function.

 '''asio_link.h'''
  - about the IntervalTimer class description: I'm not sure whether we
    need this note:
 {{{
 /// This class is mainly designed to use for calling
 /// \c QueryCounters::submitStatistics() periodically.
 }}}
   What's the purpose of this note?  The interface of interval timer
   seems quite generic and I don't see the need for mentioning the
   "main purpose" here.

  - IntervalTimer constructor
   - We should document that "cbfunc" shouldn't be an empty functor.
   - Can it throw asio::system_error?  I was not sure how that could
     happen, but if it really happens:
     - we should catch the exception internally and convert it to our
       own one (we need to hide any asio specific definitions, which is
       the purpose of this wrapper in the first place)
     - we should cover that case in tests.
  - IntervalTimer destructor: my previous question doesn't seem to be
   addressed in the documentation: "what happens if we destruct
 !IntervalTimer before starting a timer?"
  - setupTimer(): like the constructor, asio::system_error shouldn't be
 exposed.

 '''asio_link.cc'''
  - IntervalTimerImpl::callback()
 > OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
 > its arguments.

 I guess you mean "no longer exits (or asserts)"?  BTW with the check
 in setupTimer() I'm okay with the assert() (or may even be a good
 practice, but it's up to you).

 '''asio_link_unittest.cc'''
  - as a general comment, it looks like the added tests make
    ASIOLinkTest unnecessarily big(ger).  ASIOLinkTest is already
    pretty complicated due to perform some tricky tests using actual
    socket I/O, while the timer tests don't need anything of these.  It
    would be much better to define a different test fixture (say,
    IntervalTimerTest) and enclose all timer related stuff there.
  - there's no need to construct an IntervalTimer::Callback object
    explicitly thanks to the magic of boost::function.  e.g., instead
    of:
 {{{
                 timer_.setupTimer(IntervalTimer::Callback(
                                       TimerCallBack(test_obj_)), 1);
 }}}
   we could say:
 {{{
                 timer_.setupTimer(TimerCallBack(test_obj_), 1);
 }}}
   This will help make the code simpler and more readable.

  - ASIOLinkTest::startIntervalTimer()
 > OK. Almost all of IntervalTimers are now allocated statically, except in
 > destructIntervalTimer. In the test code we have to delete it.
 > I enclosed it by ASSERT_NO_THROW, is that OK?
 >
 It doesn't address the concern about the exception safety.  It's not
 about the case where newing the timer throws an exception.  It's about
 an exception triggered after the new before the test terminates.  One
 way to make it 100% correct would be to use a smart pointer for the
 itimer, but that would be probably too much in this context; if the
 test throws unexpectedly, the whole program would soon exit anyway, so
 the perfect exception safety wouldn't be worth the code complexity.
 After making the itimers real object whenever possible, my suggestion
 is to ignore the exception safety problem with an explicit comment
 like: "this code isn't exception safe, but we'd rather keep the code
 simpler and more readable as this is only for tests and if it throws
 the program would immediately terminate anyway".

 BTW, whether or not adopting this suggestion, these _NO_THROW()s don't
 make sense anyway.  I'd remove them.
  - the additional tests introduce another 10 seconds of delay.  in
  general, imposing a long delay in unit tests is not a good practice
  as unit tests are supposed to be performed "instantly".  These types
  of delay have been introduced in our tests already (some of them were
  introduced by myself), and so it's not specific to the timer tests,
  but 10 seconds still look quite substantial.  I'd not necessarily
  request it be solved right now, but the timer should eventually have
  finer granularity for the timer duration and use shorter timer
  periods for tests.  For the moment I suggest adding some notes about
  this point in comments.

 '''statistics_unittest.cc'''
   - in QueryCountersTest() verbose_mode_ should be initialized to false
   explicitly.  Otherwise, it could cause noisy log messages during
   tests depending on the environment (it happened to me).
   - incrementUDPCounter: I'd test if the counter is initially 0, and
     if it's really incremented by one after counters.inc().  Same for
     incrementTCPCounter, AuthSrvTest::queryCounter{UDP,TCP}.

 '''ChangeLog'''
  - I couldn't play with the statistics using bindctl just from this
  description.  After understanding how it works, I'd suggest revising
  the text as follows:
 {{{
   TBD.  [func]          y-aharen
         src/bin/auth: Added a feature to count queries and send counter
         values to b10-stats periodically. To support it, added wrapping
         class of asio::deadline_timer to use as interval timer.
         The counters can be seen using the "Stats show" command from
         bindctl.  The result would look like:
           ... "auth.queries.tcp": 1, "auth.queries.udp": 1 ...
         Using the "Auth sendstats" command you can make b10-auth send
         the counters to b10-stats immediately.
         (Trac #347, svn rxxxx)
 }}}
   (I used b10-stats in this context instead of generic "statistics
   module" because it would be more helpful for the users)

-- 
Ticket URL: <http://bind10.isc.org/ticket/347#comment:12>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development