BIND 10 #2157: Create an interface to pass statistics counters in Auth module

BIND 10 Development do-not-reply at isc.org
Fri Dec 21 07:26:13 UTC 2012


#2157: Create an interface to pass statistics counters in Auth module
-------------------------------------+-------------------------------------
            Reporter:  y-aharen      |                        Owner:
                Type:  enhancement   |  jinmei
            Priority:  medium        |                       Status:
           Component:  b10-auth      |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130108
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  8             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:41 y-aharen]:

 > > > > - setQueryIPVersion/setQueryTransportProtocol: I'd like to see
 > > > >   expected values for these in the documentation.  And if
 unexpected
 > > > >   values are given?
 > > > I added notes that describes expected values. If unexpected value is
 given,
 > > > the counter will not be incremented.
 > >
 > > So we cannot detect a bug that passes an invalid value unless we
 > > explicitly write tests for such cases.  That doesn't seem to be a good
 > > interface.
 > I added enum definitions for IP versions and transport protocols
 > (git c216522).

 Regarding this, see code comments below.

 > > - `set/getRequestEDNS0` brief descriptions don't seem to make sense.
 > >   especially for set, and for get it's not attribute"s" (plural).
 > >   there are some other awkward brief descriptions.  Please re-read
 > >   and fix them.
 > OK, I fixed them (git dd62fb4).

 Still don't seem to make perfect sense...see code comments below.

 > > - `MessageAttributes` get method: `const` isn't necessary for the
 > >   parameters.
 > Sorry I couldn't get it: does it mean the return value of
 > getRequestOpCode() isn't necessary to be const?

 Oops, I should probably mean set methods like this:
 {{{#!cpp
     void setRequestEDNS0(const bool is_edns_0) {
         bit_attributes_[REQ_IS_EDNS_0] = is_edns_0;
     }
 }}}
 and actually, in these cases where they also have definitions, `const`
 is not necessarily redundant because it indicates the parameter
 is mutable throughout the method.  For such one-line function,
 however, that should be obvious at a glance, so I'd still omit it, but
 that's a matter of taste and I'd leave it to you.

 > > > > - This description doesn't seem to match the implementation.  It's
 > > > >   counted for all versions of EDNS (what does BIND 9 do?).
 > > > > {{{
 > > > >         edns0           QR_REQUEST_EDNS0        Number of requests
 with EDNS(0) received by the b10-auth server.
 > > > > }}}
 > > > The implementation was wrong and I fixed it.
 > > > In BIND 9, it only counts for EDNS version 0 requests.
 > >
 > > On a closer look, I guess we should actually count all versions of
 > > EDNS.  See the code comment.
 > May I do it in the separate ticket with updating the wiki page? I guess
 it
 > requires some discussion.

 That's fine, but please don't forget creating it.

 > > > > - this description also applies to the "referral" case.
 > > > > {{{
 > > > > qrynxrrset      QR_QRYNXRRSET                   Number of queries
 received by the b10-auth server resulted in NOERROR but the number of
 answer RR == 0.
 > > > > }}}
 > > > Corrected as described above.
 > >
 > > I don't see any change on this.
 > I misunderstood you've pointed it didn't counted not only for "queries"
 > (and it was fixed).
 > The description comes from BIND 9. Is it OK to mention AA bit? Or
 another
 > approach to describe it (and QryReferral)?

 Mostly forgot, but I guess I meant the description isn't accurate.  To
 be so, one way is to clarify it's the case where AA is set, but I
 don't much care how we resolve it as long as it's clear.

 > > > > - why does this explicitly say "authoritative"?
 > > > > {{{
 > > > > authqryrej      QR_QRYREJECT                    Number of
 authoritative queries rejected by the b10-auth server.
 > > > > }}}
 > > > I haven't checked RD flag is off. I added a check and it's
 "authoritative".
 > >
 > > I don't know if we "resolve" this point by adding the check.  In that
 > > sense, why don't we count number of rejects for queries with RD being
 > > on?  BIND 9 counts both, but that's because it's a hybrid server.  In
 > > the case of b10-auth, I'd simply count all queries resulting in
 > > REFUSED in a single counter.
 > >
 > > For that matter, it may be interesting to count number of queries with
 > > RD on for b10-auth.
 > May I do it in the separate ticket with updating the wiki page? I guess
 it
 > requires some discussion.

 That's fine, but please don't forget creating it and actually
 discussing it.
 >
 > > '''statistics.h'''
 > >
 > > - this comment for `Counters::inc` doesn't seem to be correct
 > > {{{#!cpp
 > >     /// This method is mostly exception free. But it may still throw a
 > >     /// standard exception if memory allocation fails inside the
 method.
 > > }}}
 > I think it's correct; it can throw std::bad_alloc from
 > isc::statistics::Counter::Counter().

 Where does it construct `isc::statistics::Counter::Counter()`?

 > > '''auth_srv.cc'''
 > >
 > > - now that the `setRequestEDNS0` implies it's EDNS version 0, I
 > >   suspect we also need to check its version:
 [...]
 > May I do it in the separate ticket with updating the wiki page? I guess
 it
 > requires some discussion.

 That's fine, but please don't forget creating it and actually
 discussing it.

 > > > > - Why is `stats_attrs` defined as part of `AuthSrvImpl`?
 > >
 > > > It's for optimization similar to isc::auth::Query. It improved the
 > > > overhead to around 3% with other optimizations as mentioned in
 bind10-dev
 > > > list.
 > > > However, using std::bitset saves some memory, so I updated the
 branch.
 > >
 > > Does that particular change (defining it in `AuthSrvImpl`) really
 > > contribute to the performance improvement?
 > >
 > > So, unless this particular change really improves the performance, I'd
 > > generally avoid this approach.
 > OK, I updated my branch (git 2afe338).
 > I've compared query rate with query_bench using root zone, 50% of
 > non-existence queries. It shows 1.72% of performance drop.

 Hmm, it's hard to believe...maybe we should perform benchmark check on
 the actual qps.  Hopefully it's marginal.

 > > - generate_docfile: at least for me it produces garbage:
 > > {{{
 > > b'<variablelist><varlistentry><term>request.v4</term>...'
 > > }}}
 > >   This also suggests why we need a test.
 > I couldn't reproduce it with Python 3.1 and Python 3.3.
 > Could you please let me know your environment?

 It's Python 3.2 on Mac OS X 10.8, but this doesn't happen with the
 revised code.  I guess the addition for the prettier print indirectly
 solves it.

 > > '''Other'''
 > >
 > > These files are updated in the revised version.  Am I expected to
 > > review these?
 > >
 > > - src/lib/statistics/counter.h
 > > - src/lib/statistics/counter_dict.h
 > > - tests/lettuce/configurations/example.org.inmem.config
 > > - tests/lettuce/features/queries.feature
 > > - tests/lettuce/features/terrain/bind10_control.py
 > > - tests/system/bindctl/tests.sh
 > Yes, could you please review them?

 Will do separately.

 At the moment I'm dumping comments on the revised version of code
 excluding the above.

 '''statistics.{h,cc}'''

 - about `Opcode`: I'd like to avoid using a valid type (like
   RESERVED15) for placeholder until it's set.  It can be an easy
   source of bug that we leak that value.  There are several ways to
   avoid that, but if the overhead is not big I suggest
   boost::optional.
 - type of opcde, maybe boost::optional<Opcode>?
 - `using xxx` in a header file is a dangerous practice and I'd avoid it:
 {{{#!cpp
 using isc::dns::Opcode;
 }}}

 - `TransportProtocol` sounds a bit awkward because these are enums and
   essentially identifiers, not "protocols".  Maybe something like
   `TransportProtocolID` or `TransportProtocolType`?
 - I didn't notice this before, but some of REQ_IS_xxx's sound awkward.
 {{{#!cpp
     enum BitAttributes {
         REQ_IS_EDNS_0,              // request is EDNS ver.0
         REQ_IS_DNSSEC_OK,           // DNSSEC OK (DO) bit is set in
 request
         REQ_IS_TSIG,                // request is signed with valid TSIG
         REQ_IS_BADSIG,              // request is signed but bad signature
         RES_IS_TRUNCATED,           // response is truncated
         RES_IS_TSIG_SIGNED,         // response is TSIG signed
         BIT_ATTRIBUTES_TYPES
     };
 }}}
   REQ_IS_EDNS_0 and REQ_IS_TSIG are particularly awkward.  They are
   not "EDNS" or "TSIG", they are a request that has EDNS or (signed
   with) TSIG.  Please revisit the naming.  Same applies to the
   comments.
 - `MessageAttributes` constructor: I suspect I don't have to
   initialize bit_attributes_ explicitly.
 - setRequestTransportProtocol(): I personally think assert() is too
   strong:
 {{{#!cpp
         assert(transport_protocol != TRANSPORT_UNSPEC);
 }}}
   because we cannot control what is passed to the method.
   Conventionally we throw in such cases.
 - "EDNS attributes" sounds too broad for what this method does.
   method name sounds misleading too (as if it returns an EDNS object
   or something).  Same for setRequestEDNS0 and `RequestSigTSIG`.
   and `ResponseSigTSIG`.
 {{{#!cpp
     /// \brief Return EDNS attribute of the request.
     /// \return true if EDNS version of the request is 0
     /// \throw None
     bool getRequestEDNS0() const {
         return (bit_attributes_[REQ_IS_EDNS_0]);
     }
 }}}
 - `setRequestSig`: if it's specific to TSIG, I'd rename it
   `setRequestTSIG` or setRequestTSIGAttributes` or something.
 - `setResponseSigTSIG`: do we need the `is_tsig_signed` parameter?

 '''auth_srv.cc'''

 - These don't seem to be very accurate:
 {{{#!cpp
     stats_attrs.setRequestIPVersion(
         io_message.getRemoteEndpoint().getFamily() == AF_INET ?
             MessageAttributes::IP_VERSION_IPV4 :
             MessageAttributes::IP_VERSION_IPV6);
     stats_attrs.setRequestTransportProtocol(
         io_message.getRemoteEndpoint().getProtocol() == IPPROTO_UDP ?
             MessageAttributes::TRANSPORT_UDP :
             MessageAttributes::TRANSPORT_TCP);
 }}}
   in that, technically, there's no guarantee that it's AF_INET6 if
   it's not AF_INET.  To be precise the logic should be something like:
 {{{#!cpp
   switch (getFamily()) {
   case AF_INET6:
     setIPversion(IPv6);
     break;
   case AF_INET:
     setIPversion(IPv4);
     break;
   default
     throw Unexpected;
  }
 }}}
  Also, I'm not sure if it makes much sense to do this in auth_srv.
  Even if it's enum it's not that much type safe (invalid value still
  can be passed via cast), and it looks like `AuthSrv` is handling what
  the statistics specific module should care about.

 - processUpdate: why adding the parameter if you don't use it?
 {{{#!cpp
 AuthSrvImpl::processUpdate(const IOMessage& io_message,
                            MessageAttributes&)
 }}}

 '''gen-statisticsitems.py'''
 - convert_list in generate_docfile: at least to me this is not intuitive:
 {{{#!python
                 if not prev is None:
                     prev.tail = prev.tail.rstrip()
 }}}
   I'd suggest changing it to `if prev is not None`
 - convert_list in generate_docfile: I don't understand the purpose of
   these changes:
 {{{#!diff
 -                        command.tail = ' '
 +                        command.tail = ''
 ...
 +                sim_para.text = sim_para.text.rstrip()
 +                if not prev is None:
 +                    prev.tail = prev.tail.rstrip()
 }}}
   please leave comments.  Also, in general, making a single big commit
   like this one just saying "address review comment" is not good.

 - generate_cxx: (discussion)

 >> generate_cxx: if I understand it correctly, its convert_list() doesn't
 work correctly only up to 1 recursion.

 > I think it should work. Anyway, if it doesn't work, unittest, systest
 and/or lettuce fails.

 I just noticed my original text was broken, so rephrasing it:
 convert_list() doesn't work correctly with two or more recursive
 calls.  If you still it should work, could you explain what kind of
 output is expected if the recursion at this point happens twice?
 {{{#!cpp
                 msg_counter_types, item_names_current_, item_names = \
                         convert_list(item['child'], msg_counter_types,
                                      item_names_current_, item_names)
 }}}
 as for the argument if it doesn't, as a last resort defense that's
 true (and I suspect in this case it wouldn't even compile), but, in
 general, errors should be caught at as early point as possible: we
 prefer test failure over run time surprise in a production
 environment; prefer compile-time error over test failure; and prefer
 code generator failure over compile error on the resulting code.  It's
 not really good to leave it open if we are not sure if it works in
 this scenario.

 So, if we know it really works with any number of recursion, that's
 fine; otherwise, we should either fix it or explicitly fail on the
 unexpected cases.  My current guess is it doesn't work, and in which
 case I'd just make the script fail explicitly.

 '''b10-auth.xml.pre'''

 - Can we now resolve this also?
 {{{
 <!-- TODO: missing stats docs. See ticket #1721 -->
 }}}

 '''statisticsitems.py'''

 - if we chose to write a test for this script, I'd rather write more
   standard unit test that directly checks main functions of it.  But
   at this stage we should focus on completing this task, so I'm fine
   with this version test.
 - I'd rename it to, e.g., gen-statistics_test.py in order to clarify
   it's the tests for gen-statistics, not some helper module for other
   tests or tests for something other than gen-statistics.
 - per our name convention for Python `testXMLfile` would have to be
   named `test_xml_file`.

 '''auth_srv_unittest.cc'''

 - `queryCounterTruncTest`: it seems to use sqlite3 data source.  If
   so, it should be skipped for static-link environments like this:
 {{{#!cpp
 TEST_F(AuthSrvTest,
 #ifdef USE_STATIC_LINK
        DISABLED_chQueryWithInMemoryClient
 #else
        chQueryWithInMemoryClient
 #endif
     )
 }}}

 - By "It would also be good if we simplify and unify the verbose
   pattern seen in (e.g.) queryCounterUDPNormal." in my previous
   comment, I intended to introduce something like a unified
   comprehensive check like `checkCountersAreInitialized()`
   with parameterized exception lists to specify the counters that are
   expected to have non 0 values.  The current revised version seems
   to not only "simplify" it but also actually "remove" some cases that
   were previously tested (confirming some counters are actually
   unchanged).

 '''statistics_unittest.cc.pre'''

 - To guard EXPECT_DEATH, I suggest using EXPECT_DEATH_IF_SUPPORTED
   instead of `ifdef`
 - Regarding the previous point, check also the comment on the tested
   code; I thought assert() was too harsh in this case.
 - result of bit operation should be directly used in a boolean
   context (http://bind10.isc.org/wiki/BIND9CodingGuidelines "Testing
   Bits").  I've fixed them myself.
 - naming this `af` is a bit awkward because it's not an `AF_xxx` value:
 {{{#!cpp
         const enum MessageAttributes::IPVersion af = (i & 1) != 0 ?
             MessageAttributes::IP_VERSION_IPV4 :
             MessageAttributes::IP_VERSION_IPV6;
 }}}

-- 
Ticket URL: <http://bind10.isc.org/ticket/2157#comment:43>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development


More information about the bind10-tickets mailing list