BIND 10 #2157: Create an interface to pass statistics counters in Auth module

BIND 10 Development do-not-reply at isc.org
Thu Nov 22 07:22:12 UTC 2012


#2157: Create an interface to pass statistics counters in Auth module
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  y-aharen                           |                Status:  reviewing
                       Type:         |             Milestone:
  enhancement                        |  Sprint-20121204
                   Priority:         |            Resolution:
  medium                             |             Sensitive:  0
                  Component:         |           Sub-Project:  DNS
  b10-auth                           |  Estimated Difficulty:  8
                   Keywords:         |           Total Hours:  0
            Defect Severity:  N/A    |
Feature Depending on Ticket:         |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by y-aharen):

 * owner:  y-aharen => jinmei


Comment:

 Hello,

 I made a small change not mentioned in the review, which fixes
 @@LOCALSTATEDIR@@
 is not replaced in b10-auth.xml.

 Replying to [comment:33 jinmei]:
 > Replying to [comment:32 y-aharen]:
 >
 > > > '''general'''
 > > >
 > > > - Isn't it possible to unify the definition of the counter
 information
 > > >   (things like QR_REQUEST_IPV4, QR_OPCODE_QUERY, etc), e.g., in some
 > > >   separate file and auto generate others?  Right now, we have the
 same
 > > >   kind of information at least in 3 different places: auth.spec,
 > > >   statistics_items.h, and the man page.  Even excluding RR types,
 that
 > > >   can easily cause maintenance problems.
 > > Now they are automatically generated in compile time.
 >
 > Okay, thanks.  The new approach looks much cleaner and more
 > understandable.
 >
 > > > - I'd like to avoid relying on a magic keyword like `_SERVER_`.  In
 > > >   fact, there could even a zone of that name.
 > > The other modules such as Xfrout has same structure. If it's not
 suitable,
 > > wiki:StatisticsItems and the other modules should also be updated. I
 think
 > > it is out of scope of this ticket, can't it be in the separate one?
 >
 > Works for me, but please don't forget making it and having the
 > discussion.
 >
 > > > - if possible I'd like to make one clarification before merging this
 > > >   branch.  That is, the BIND 9 version of statistics counters are
 > > >   confusing in that the sum of any of QRYxxx does not match the
 number
 > > >   of incoming queries.  I'd clarify the counters so that at least
 the
 > > >   sume of some subset of QRYxxx counters equals to QR_OPCODE_QUERY.
 > > I agree the meaning of the counters are confusing. However, I think
 some of
 > > BIND 9 users rely on the counters and are not willing to change them.
 > > I propose to leave them in BIND 10 for compatibility. If you don't
 like to
 > > leave them, I'd like to just remove them from the branch and the
 discussion
 > > for QR_QRYxxx should be in another ticket.
 >
 > For the purpose of this ticket, I'm okay with this approach.  But I
 > think we should have this discussion sooner than later.  Also, the
 > current implementation actually doesn't seem to be BIND 9-compatible
 > in some points.  See code comments below for more details.
 >
 > Now, specific comments on the implementation:
 >
 > '''general'''
 >
 > - don't we need a changelog for this change?  was it planned in other
 >   related ticket?
 ChangeLog is in this ticket including #2154, #2155. I added it.

 > - not sure if it's an issue of this branch, but distcheck fails for
 >   me.  Please check. (it's important to confirm ditcheck works for
 >   branches introducing many new files like this one)
 distcheck succeeds in my environment. Could you please let me show
 which test fails? Please also be advised that distcheck fails if -j option
 is given to make command, even in the lastest master (git 3637880).

 > - terminology consistency: the policy of "request" and "query" doesn't
 >   seem to be consistent.  For example: it begins with "query" and then
 >   has a comment using "request".  There are other such cases
 >   throughout the code.  These should be consistent.
 > {{{#!cpp
 > class QRAttributes {
 > /// \brief Query/Response attributes for statistics.
 > ...
 > friend class Counters;
 > private:
 >     // request attributes
 > }}}
 >   In the BIND 9 statistics, I intentionally used "query" for standard
 >   queries (opcode 0) and used "request" for the general case
 >   (including the standard queries).
 OK, updated the branch following the naming policy.

 > - since we are not sure how we support per zone statistics yet, I'd
 >   suggest simplify this branch excluding per-zone related code.
 OK, I removed them.

 > - if I don't miss something, there are no tests for newly introduced
 >   counters in the context of request processing: something like
 >   `queryCounterUDPNormal` when the query (request) is with EDNS.
 I added some testcases.

 > '''gen-rdatacode.py.in'''
 >
 > Why was this file modified?
 It's my mistake. Reverted my change.

 > '''auth_srv.cc'''
 >
 > - Why is `stats_attrs` defined as part of `AuthSrvImpl`?  I don't see
 >   any need for it because it's temporary by nature.  It seems to be
 >   even more advantageous to define it as a local variable of
 >   processMessage (and initialize it in a constructor); it will then be
 >   more likely to be in the same data cache as other local variables.
 >
 > - related to the previous point, I wonder whether we can use bitmaps
 >   for some of the boolean members of `QRAttributes`.  If these are
 >   supposed to default to false, we can initialize all of them just by
 >   setting a single or a few unsigned integers to 0.
 It's for optimization similar to isc::auth::Query. It improved the
 overhead to around 3% with other optimizations as mentioned in bind10-dev
 list.
 However, using std::bitset saves some memory, so I updated the branch.

 > - also related, I'd remove `zone_origin_` from `QRAttributes` for now.
 >   it's currently unused, and, in any event, using a string to
 >   represent a zone (domain) name wouldn't be the best idea.
 OK, I removed the member.

 > - not specific to this branch, but some of the statistics related
 >   comments are too obvious and seem to make the code unnecessarily
 >   long and unreadable.  Those include:
 > {{{#!cpp
 >         // statistics: check TSIG attributes
 > ...
 >         // statistics: check EDNS
 > ...
 >         // statistics: check OpCode
 > }}}
 >   I'd simply remove them.
 OK, I removed them.

 > - I don't see the need for the separate block for EDNS:
 > {{{#!cpp
 >         {
 >             ConstEDNSPtr edns = message.getEDNS();
 >             if (edns != NULL) {
 >                 impl_->stats_attrs_.setQueryEDNS(true,
 >                                                  edns->getVersion() !=
 0);
 >
 impl_->stats_attrs_.setQueryDO(edns->getDNSSECAwareness());
 >             }
 >         }
 > }}}
 >   and, it'd be a waste if we release the pointer at the of the block;
 >   we should be able to pass it to processNormalQuery().  Also related,
 >   I'd change the `if` condition as follows:
 > {{{#!cpp
 >             if (edns) {
 > }}}
 >   i.e., implicitly converting it to a boolean.  Using `operator!=` with
 >   a smart pointer would involve more instructions and could be
 >   unnecessarily expensive.
 OK, updated the branch as you suggested.

 > - isn't it slightly better to simply pass `done` from `resumeServer()`
 >   to `Counters::inc()`?  Then we can save one method call
 (`answerWasSent()`)
 >   and the corresponding member from `QRAttributes`.
 OK, updated the branch as you suggested.

 > '''gen-statistics_items.py'''
 >
 > - very minor, but the mixture of dash (-) and underscore (_) in the
 >   file name looks awkward.
 OK, I renamed it.

 > - it generally lacks documentation.  Please at least describe a brief
 >   description and type/meaning of the parameters for each function.
 > - also related, I'd like to see a description of the format of the
 >   input file.
 OK, I added brief description and format of the input file.

 > - it misses tests.  is it okay?  for example, can we be sure it's robust
 >   against broken input? (gen-rdatacode is a bad precedent, but let's
 >   not just use it as an excuse for making another bad practice).
 I think it can be covered with C++ compiler and the other tests since
 the output of the script becomes the input of C++ parser and/or the other
 tests.

 > - some existing comments seem a naive copy of gen-rdatacode and
 >   not make sense in this context.  Example:
 > {{{#!python
 >         sys.stderr.write('Code generation failed due to exception: %s\n'
 %
 >                 sys.exc_info()[1])
 > }}}
 >   (why "code generation"?).
 OK, I updated the comments.

 > - I needed this change to make it work:
 > {{{#!diff
 > -            ElementTree.tostring(variable_tree)))
 > +            str(ElementTree.tostring(variable_tree))))
 > }}}
 > - Note: I've not fully reviewed this file due to these points.
 I fixed the problem with your suggestion. It works both Python 3.1
 and Python 3.2.

 > '''statistics_items.h.pre'''
 >
 > - It's not immediately clear why we need to have it as a separate file
 >   because it's basically only needed by statistics.cc.  If it's for
 >   testing purposes, please describe it in the file.
 I added a note.

 > - I'd like to see structure description for `CounterTypeTree`.
 I added a description.

 > - `CounterTypeTree`: is "tree" an appropriate name: this structure
 >   rather seems to form a single-link list. (and this structure itself
 >   is an entry of the list, not the list itself).  hmm, after
 >   understanding how these are defined and used, it seems to be more
 >   appropriate to call it, e.g, `CounterSpec`, and rename `sub_tree`
 >   (e.g.) `sub_counters`.
 OK, I renamed them.

 > - why is counter_id a (signed) int?  can it be negative?  Hmm, I see
 >   -1 is used as a magic number.  Magic numbers aren't good.  I'd
 >   suggest introducing a constant, and also describing that we use
 >   negative values for some special cases.
 The type of counter_id is the type of enum value; it's signed int.
 I introduced a constant and added description.

 > '''statistics.h,cc'''
 >
 > - `QRAttributes`: the position of the class description looks
 >   awkward.  Does doxygen handle it correctly?
 I moved it to correct position.

 > - `QRAttributes`: I was not sure how it could be justified to specify
 >   the `Counters` class as a friend of it.  In general, I'd be against
 >   of introducing a friend if it's just for convenience; it's often an
 >   indication that we don't fully consider class designs.  Is there any
 >   essential reason that `Counters` needs to be a friend of
 >   `QRAttributes` other than we can skip providing getters?  See also
 >   the above discussions on bool vs bitmaps.  friend class makes it
 >   difficult to introduce changes to the internal representation if we
 >   want to make such changes later that could be otherwise completely
 >   within the `QRAttributes`.
 OK, I added getter functions and make it independent class.

 > - mostly a matter of taste, but you might want to define some very
 >   simple methods within the class definition, e.g.: instead of doing
 >   this:
 > {{{#!cpp
 > class QRAttributes {
 >     inline void setQueryOpCode(const int opcode);
 > };
 > }}}
 >   (and define `setQueryOpCode` separately), do this:
 > {{{#!cpp
 > class QRAttributes {
 >     void setQueryOpCode(const int opcode) { req_opcode_ = opcode; }
 > };
 > }}}
 >   (you don't need to specify `inline` as it's implicated in this style.)
 cherry-picked from trac2155.

 > - `QRAttributes`: you don't have to define the trivial destructor
 >   explicitly.
 OK, I removed empty destructors.

 > - `QRAttributes::setQueryOpCode` why does it take a (signed) integer?
 >   what if it's a bogus value like -1 or 16?  isn't it better to take
 >   an `Opcode` object itself?  In any case the same question applies
 >   to req_opcode_: why it's an int?
 Passing isc::dns::OpCode introduces more dependencies in stats code,
 so I would avoid it. However, it should be at least unsigned. Should we
 introduce an assertion check?

 > - we need a blank line before `\brief` for readability (see the new
 >   rule of the style guideline).  It's trivial so I made the change
 OK, thank you.

 > - `QRAttributes` constructor: you don't have to explicitly initialize
 >   zone_origin_ in this case (but I'd rather suggest removing it for
 >   now).
 cherry-picked from trac2155.

 > - `QRAttributes::setQuerySig()` this interface seems to be a bit ad
 >   hoc.  how would it be used if and when we support SIG(0)?
 Sorry if I misunderstood your question, is_sig0 will be used for SIG(0).

 > - setQueryIPVersion/setQueryTransportProtocol: I'd like to see
 >   expected values for these in the documentation.  And if unexpected
 >   values are given?
 I added notes that describes expected values. If unexpected value is
 given,
 the counter will not be incremented.

 > - reset() won't be necessary if we define it as a method's local
 >   variable (see above).
 If it's not good to define MessageAttributes as class member, I'll define
 it as a method's local variable and remove reset().

 > - `ItemTreeType`: I'm afraid this typedef could be confusing because
 >   it's not obvious from the typedef name that this is actually a
 >   (smart) pointer.
 I renamed it to ItemTreePtr.

 > - `ItemTreeType`: I suspect the diagram won't be formatted cleanly in
 >   doxygen. you'll need verbatim markup or something.
 OK, I used \verbatim for markup.

 > - Counters destructor doesn't have be declared or defined explicitly
 >   in this case.
 OK, I removed empty destructors.

 > - Counters description: the class description doesn't seem to match
 >   the latest implementation.
 I updated the description.

 > - fillNodes: maybe a matter of taste, but I guess it might be better
 >   to have it return the map (pointer) (which is called "tree" here),
 >   rather than having it fill in the passed one (`trees` parameter).
 >   If you see this code first time, its' not immediately clear if
 >   `trees` is supposed to be empty.
 I intentionally passing a tree to be filled as an argument to be used
 like the following:
 {{{#!cpp
 1 fillNodes(a_counter, ItemNameMap, item_tree);
 2 fillNodes(another_counter, AnotherItemNameMap, item_tree);
 }}}
 This enables to fill two or more set of counters into a tree. It will
 be useful in case we have more set of counters in Auth module.
 Anyway, I added notes for the function. It doesn't appear in doxygen
 since it's in the anonymous namespace.

 > - I'd explicitly include statistics/counter.h from statistics.cc
 >   as the cc file directly uses its definition (`Counter`).
 The header file must be included from statistics.h as a definition
 for class isc::auth::statistics::Counters requires
 isc::statistics::Counter.
 {{{#!cpp
 230 class Counters : boost::noncopyable {
 231 private:
 232     // counter for DNS message attributes
 233     isc::statistics::Counter server_msg_counter_;
 }}}

 > - why does `Counter` store the value as int?  Do we expect negative
 >   values?  And are int (often 32-bit) enough (I don't think so btw)?
 isc::statistics::Counter stores the value as 'unsigned' int. We don't
 expect negative values. However, isc::data::IntElement only accepts signed
 values (long) as I read the document for the class. I assume signed long
 in
 most of environments can represent maximum value of unsigned int but
 cannot of unsigned long. Also, wiki:StatsModule implicitly assumes that
 counters are 32-bit according to MIB structure.

 > - related: I guess fillNodes() converts the counter value to `long
 >   int` to match the corresponding `ElementPtr` constructor, but I
 >   suspect we actually need to handle unsigned 64-bit integers.
 It is due to a limitation of isc::data::IntElement.

 > - `QRCounterRequest` etc: per naming convention, these should be
 >   something like qr_counter_request, etc.  Also, things like these
 >   should be defined in an unnamed namespace.
 OK, I modified the branch.

 > - `QROpCodeToQRCounterType`: doesn't follow naming convention.
 > - same comments apply to `QRRCodeToQRCounterType`.
 OK, I modified the branch.

 > - `QROpCodeToQRCounterType`: why hardcoding 16?  It seems to be a
 >   fragile practice.
 It was to avoid misedit. Instead of hardcoding, I added a note.

 > - `Counters::incResponse`: I suggest directly converting
 >   `ConstEDNSPtr` to bool (see above):
 > {{{#!cpp
 >     ConstEDNSPtr response_edns = response.getEDNS();
 >     if (response_edns && response_edns->getVersion() == 0) {
 >         server_qr_counter_.inc(QR_RESPONSE_EDNS0);
 >     }
 > }}}
 >   also, why do we bother to check the version here?
 removed '== NULL'. I thought EDNS version check is required
 because the counter is for EDNS version 0 and getVersion() could return
 other than 0, but it's not. I corrected the condition.

 > - is this assumption correct?
 > {{{#!cpp
 >         // assume response is TSIG signed if request is TSIG signed
 > }}}
 I think yes, according to auth_srv.cc:
 {{{#!cpp
 668     if (tsig_context.get() != NULL) {
 669         message.toWire(renderer_, *tsig_context);
 670     } else {
 671         message.toWire(renderer_);
 672     }
 }}}

 > - about rcode: it makes me nervous regarding out-of-range access for
 >   the array, due to the direct use of magic numbers.  I'd do this:
 >   define a constant variable to indicate the number of array entries:
 > {{{#!cpp
 > namespace {
 > const int qr_opcode_to_qrcounter[] = {
 > ...
 > };
 > const size_t num_qr_opcode_to_qrcounter =
 >     sizeof(qr_opcode_to_qrcounter) /
 >   sizeof(qr_opcode_to_qrcounter[0]);
 > }
 > }}}
 >   and use it for the range check.  also, rcode_type can be const:
 > {{{#!cpp
 >     const size_t rcode_type =
 >         rcode < num_qr_opcode_to_qrcounter ?
 >         qr_opcode_to_qrcounter[rcode] : QR_RCODE_OTHER;
 > }}}
 OK, I updated the branch as you've suggested.

 > - In `Counters::incResponse`, `QR_QRYSUCCESS` etc, would only be
 >   incremented for standard queries in BIND 9.  So if the purpose is to
 >   be compatible with BIND 9, this code isn't.  On the other hand, if
 >   we don't care about the compatibility, I'd wonder why we bother to
 >   count `QR_QRYREJECT`, which would be redundant due to the rcode
 >   counters.  Likewise, I'd wonder why we don't count NXDOMAIN, etc.
 I updated the branch to count only for OpCode=QUERY. Thank you.

 > '''statistics_qr_items.def'''
 >
 > - This description doesn't seem to match the implementation.  It's
 >   counted for all versions of EDNS (what does BIND 9 do?).
 > {{{
 >       edns0           QR_REQUEST_EDNS0        Number of requests with
 EDNS(0) received by the b10-auth server.
 > }}}
 The implementation was wrong and I fixed it.
 In BIND 9, it only counts for EDNS version 0 requests.

 > - I guess "DO bit" needs some more explanation:
 > {{{
 >       dnssec_ok       QR_REQUEST_DNSSEC_OK    Number of requests with DO
 bit received by the b10-auth server.
 > }}}
 Is it enough?
 {{{
 Number of requests with "DNSSEC OK" (DO) bit was set received by the
 b10-auth server.
 }}}

 > - as already discussed, this description doesn't seem to be correct:
 > {{{
 > qrysuccess    QR_QRYSUCCESS                   Number of queries received
 by the b10-auth server resulted in rcode = NOERROR and the number of
 answer RR >= 1.
 > }}}
 >   (depending on the definition of "queries", though)
 Corrected as described above.

 > - this description also applies to the "referral" case.
 > {{{
 > qrynxrrset    QR_QRYNXRRSET                   Number of queries received
 by the b10-auth server resulted in NOERROR but the number of answer RR ==
 0.
 > }}}
 Corrected as described above.

 > - why does this explicitly say "authoritative"?
 > {{{
 > authqryrej    QR_QRYREJECT                    Number of authoritative
 queries rejected by the b10-auth server.
 > }}}
 I haven't checked RD flag is off. I added a check and it's
 "authoritative".

 > - inconsistent use of query vs request.  and in this case I guess
 >   "request" is more correct.
 > {{{
 >       noerror         QR_RCODE_NOERROR        Number of queries received
 by the b10-auth server resulted in RCODE = 0 (NoError).
 > }}}
 >   same for others.
 Yes, they should be request. I fixed it.

 > - I don't understand why we need to define rcode = 17 this way:
 > {{{#!cpp
 >       badsigvers      QR_RCODE_BADSIGVERS     Number of queries received
 by the b10-auth server resulted in RCODE = 16 (BADVERS, BADSIG).
 > }}}
 >   BADSIG is only used in a TSIG RR and shouldn't appear in the header
 >   rcode field.  Same comment applies to rcode=18-21 (some are specific
 >   for TKEY RR).
 I misunderstood these Rcodes. I renamed badsigvers to badsig and removed
 rcode=18-22; they're not currently implemented.

 > '''statistics_unittest.cc'''
 > - what's the purpose of the return value of
 >   `checkCountersAllZeroExcept`?  btw: there are missing ()'s for return.
 cherry-picked from trac2155.

 > - overall, the tests don't seem to be comprehensive.  it misses many
 >   cases of the parameters (how about the case for rcode other than
 >   REFUSED? etc).
 OK, I added testcases for isc::auth::statistics::Counters.

 Thanks,

-- 
Ticket URL: <http://bind10.isc.org/ticket/2157#comment:35>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list