BIND 10 #2157: Create an interface to pass statistics counters in Auth module

Thu Nov 29 04:23:03 UTC 2012

#2157: Create an interface to pass statistics counters in Auth module
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  y-aharen                           |                Status:  reviewing
                       Type:         |             Milestone:
  enhancement                        |  Sprint-20121204
                   Priority:         |            Resolution:
  medium                             |             Sensitive:  0
                  Component:         |           Sub-Project:  DNS
  b10-auth                           |  Estimated Difficulty:  8
                   Keywords:         |           Total Hours:  0
            Defect Severity:  N/A    |
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Still working on it, but dumping another round of intermediate comments.

 Replying to [comment:35 y-aharen]:

 > > '''gen-statistics_items.py'''

 > > - it misses tests.  is it okay?  for example, can we be sure it's
 robust
 > >   against broken input? (gen-rdatacode is a bad precedent, but let's
 > >   not just use it as an excuse for making another bad practice).
 > I think it can be covered with C++ compiler and the other tests since
 > the output of the script becomes the input of C++ parser and/or the
 other
 > tests.

 It doesn't only produce C++ code, and, in fact, there seem to be some
 errors in XML output (see specific comments).  It's not good, but
 we've spent too much time on this ticket, so we should probably live
 with(out) it...but, as a general matter, please do not easily conclude
 "we don't need a test for this".  It often turns out to be wrong.

 > > - This description doesn't seem to match the implementation.  It's
 > >   counted for all versions of EDNS (what does BIND 9 do?).
 > > {{{
 > >     edns0           QR_REQUEST_EDNS0        Number of requests with
 EDNS(0) received by the b10-auth server.
 > > }}}
 > The implementation was wrong and I fixed it.
 > In BIND 9, it only counts for EDNS version 0 requests.

 On a closer look, I guess we should actually count all versions of
 EDNS.  See the code comment.

 > > - this description also applies to the "referral" case.
 > > {{{
 > > qrynxrrset  QR_QRYNXRRSET                   Number of queries received
 by the b10-auth server resulted in NOERROR but the number of answer RR ==
 0.
 > > }}}
 > Corrected as described above.

 I don't see any change on this.

 > > - why does this explicitly say "authoritative"?
 > > {{{
 > > authqryrej  QR_QRYREJECT                    Number of authoritative
 queries rejected by the b10-auth server.
 > > }}}
 > I haven't checked RD flag is off. I added a check and it's
 "authoritative".

 I don't know if we "resolve" this point by adding the check.  In that
 sense, why don't we count number of rejects for queries with RD being
 on?  BIND 9 counts both, but that's because it's a hybrid server.  In
 the case of b10-auth, I'd simply count all queries resulting in
 REFUSED in a single counter.

 For that matter, it may be interesting to count number of queries with
 RD on for b10-auth.

 '''statistics.h'''

 - this comment for `Counters::inc` doesn't seem to be correct
 {{{#!cpp
     /// This method is mostly exception free. But it may still throw a
     /// standard exception if memory allocation fails inside the method.
 }}}

 '''auth_srv.cc'''

 - now that the `setRequestEDNS0` implies it's EDNS version 0, I
   suspect we also need to check its version:
 {{{#!cpp
 -                impl_->stats_attrs_.setQueryEDNS(true,
 -                                                 edns->getVersion() !=
 0);
 +            impl_->stats_attrs_.setRequestEDNS0(true);
 }}}
   note that previously it was called `setQueryEDNS` (no '0') so it
   looked awkward to check the version.  note also that libdns++
   doesn't ensure it's always version 0; it just throws if the version
   number is "unknown".  right now it just happens to mean 0 and only
   0, it's not guaranteed in terms of API.  Come to think of it, I'm
   not sure if we want to differentiate EDNS0 and other versions within
   b10-auth; the design decision of libdns++ is to make the user of it
   agnostic about which versions of EDNS0 are supported.  So, in auth
   statistics it seems to make more sense to handle any EDNS versions
   (but in reality it always means version 0, so behavior-wise it's
   still compatible with BIND 9).

 > > - Why is `stats_attrs` defined as part of `AuthSrvImpl`?

 > It's for optimization similar to isc::auth::Query. It improved the
 > overhead to around 3% with other optimizations as mentioned in
 bind10-dev
 > list.
 > However, using std::bitset saves some memory, so I updated the branch.

 Does that particular change (defining it in `AuthSrvImpl`) really
 contribute to the performance improvement?  It's hard to believe to
 me, especially with the latest branch version: operational-wise the
 overhead should basically be identical: if we define it as temporary,
 the overhead is to call the constructor (the destructor is no-op, so
 the compiler will optimize it out); if we define it in the class,
 the overhead is to call reset() at the end.  And these methods do the
 same thing.  I wouldn't even be surprised if the temporary version is
 faster, because the struct is not pretty compact and it's more likely
 contained in the data cache with other local variables (although I
 also suspect such effect is marginal in the context of the total
 response performance).

 Assuming the performance difference is marginal, my main concern with
 the current approach is that it introduces the additional state.  So
 we need to ensure it's always cleaned up for every query (or always
 re-initialized at the start).  The current implementation does that
 by calling reset() at the end of the processing, but since the code is
 complicated it's not immediately clear if it covers all cases (isn't
 there really missing cleanup case?).  Even if it's currently correct,
 this approach is susceptible to future code changes; if and when we
 add an "intermediate exit", it could then be a point of missing cleanup.

 So, unless this particular change really improves the performance, I'd
 generally avoid this approach.

 '''gen-statisticsitems.py'''

 - this is necessary?
 {{{#!python
     global item_list
 }}}
   (there are several cases of this).  according to the way item_list
   is used, the `global` declaration doesn't seem to be necessary to me.
 - using `with` is probably a bit better than explicit open-close.
 {{{#!python
     #item_definition = open(items_definition_file, 'r')
     with open(items_definition_file, 'r') as item_definition:
         # (no explicit close() in this block)
 }}}
   Same for other cases.
 - import_definitions: I'm afraid it's not obvious what this does and
   needs a comment:
 {{{#!python
         if element[0] == '':
             element.pop(0)
 }}}
 - it's not obvious why using '+' here.  Please add comment.
 {{{#!python
         stats_pre_json = \
             json.loads(stats_pre.read().replace('@@LOCAL'+'STATEDIR@@',
                                                 localstatedir))
 }}}
 - generate_specfile: this doesn't seem to be correct:
 {{{
     This method reads the content of skeleton and replaces
     <!-- ### STATISTICS DATA PLACEHOLDER ### --> with statistics items
     definition.
 }}}
 - generate_docfile: its convert_list isn't trivial.  Please add more
   comment/doc.
 - generate_docfile: at least for me it produces garbage:
 {{{
 b'<variablelist><varlistentry><term>request.v4</term>...'
 }}}
   This also suggests why we need a test.
 - generate_docfile: also, it's better if the output is more readable
   with newlines.
 - generate_docfile: what's the purpose of this line?
 {{{#!python
                         para_tail = command
 }}}
 - generate_docfile: with this implementation it will have a redundant
   space at the end of simpara.
 - generate_cxx: if I understand it correctly, its convert_list()
   doesn't work correctly only up to 1 recursion.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2157#comment:37>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development