BIND 10 #356: Resolver address database

Mon Dec 13 08:18:22 UTC 2010

#356: Resolver address database
-------------------------------+--------------------------------------------
      Reporter:  stephen       |        Owner:  stephen  
          Type:  enhancement   |       Status:  reviewing
      Priority:  major         |    Milestone:           
     Component:  Unclassified  |   Resolution:           
      Keywords:                |    Sensitive:  0        
Estimatedhours:  0.0           |        Hours:  0        
      Billable:  1             |   Totalhours:  0        
      Internal:  0             |  
-------------------------------+--------------------------------------------
Changes (by ocean):

  * owner:  ocean => stephen

Comment:

 Replying to [comment:19 stephen]:
 > '''src/lib/nsas/tests/nameserver_entry_test.cc'''[[BR]]
 > Changes are OK but in checking them I noticed what appears to be an
 error in the checks in test !AddressSelection.  The first test attempts to
 check that three addresses are selected with equal probability and tests
 this by iterating 10,000 times and incrementing one of three counters (c1,
 c2, c3).  The test is then:
 > {{{
 >     // c1, c2 and c3 should almost be equal
 >     ASSERT_EQ(1, (int)(c1*1.0/c2 + 0.5));
 >     ASSERT_EQ(1, (int)(c2*1.0/c3 + 0.5));
 > }}}
 > But these two assertions are true if (for example) c1 = 4713, c2 = 3163
 and c3 = 2124, i.e. c1 > 2*c3 - which is a significant deviation from
 equal probability.
 >
 > (Figures obtained by noting that {{{(int) 1.99 = 1}}} then setting
 {{{c1/c2 = c2/c3 = 1.49}}} and {{{c1+c2+c3 = 10000}}}.  BTW the iteration
 count is 10,000 for the first loop in this test and 100,000 for the rest -
 I suggest that the higher number be used.)
 >
 > Similar comments apply to the ASSERT_EQ checks at the end of the test.
 >
 > Testing the output of a random number generator is an involved process.
 I am not an expected on statistics, but I analyze the first test (where we
 expect each of three addresses to be selected with equal probability) as
 follows:
 >
 > For each address, each time through the loop we expected it to be
 selected with a probability of p (= 1/3) and not selected with a
 probability of q (= 1-p = 2/3).  This suggests a binomial distribution,
 but as the number of selections (N) is large, we end up with a normal
 distibution with mean Np (= N/3) and variance of Npq (= 2N/9) (see for
 example [http://mathworld.wolfram.com/NormalDistribution.html]).  With N =
 100,000, the standard deviation (= sqrt(variance)) is about 149 or roughly
 0.15% of N.
 >
 > Although we want an even distribution, it doesn't really matter too much
 if the numbers vary slightly.  So checking that the actual number of
 selections of a given address is within 1% of its expected value is
 probably good enough.  With the figures used here, one percent of N away
 from the mean is over six standard deviations, and we expect that actual
 value will be closer than this to the expected value well over 99.99% of
 the time.
 >
 > In summary, testing that "abs(c1 - (N/3)) <= (N/100)" (and the same for
 c2 and c3) should check that they are selected with equal probability and
 work virtually every time.
 >
 > Note that slightly different values are required when the probabilities
 are unequal.

 Done. Correct the test methods to make sure that fabs(c - mu) < 4*sigma.

-- 
Ticket URL: <https://bind10.isc.org/ticket/356#comment:20>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development