BIND 10 #256: bugs in base32.cc

Tue Jul 20 06:50:05 UTC 2010

#256: bugs in base32.cc
--------------------------------+-------------------------------------------
      Reporter:  jinmei         |        Owner:  jinmei                     
          Type:  defect         |       Status:  reviewing                  
      Priority:  major          |    Milestone:  06. 4th Incremental Release
     Component:  DNSPacket API  |   Resolution:                             
      Keywords:                 |    Sensitive:  0                          
Estimatedhours:                 |        Hours:                             
      Billable:  1              |   Totalhours:                             
      Internal:  0              |  
--------------------------------+-------------------------------------------

Comment(by jinmei):

 Thanks for the very careful review and detailed comments.

 > '''src/lib/dns/tests/rdata_rrsig_unittest.cc'''
 > '''src/lib/dns/tests/rdata_nsec3_unittest.cc'''
 > Differences OK.  Good idea to just put currently failing test cases in a
 disabled test rather than comment them out - it leaves them visible when
 the tests are run.
 >

 Agreed, these ifdefs are just an ugly leftover when I didn't know how to
 disable specific tests temporarily.  This is actually a separate issue
 from the main topic of this ticket, and fixing them is quite trivial, so I
 directly made that change on trunk.

 > '''src/lib/dns/tests/base32hex_unittest.cc'''
 > '''src/lib/dns/tests/hex_unittest.cc'''
 > Differences OK, although it took a few minutes to work out the logic for
 the decodeMap test (i.e. that a string with only the last character not
 equal to the zero character would result in a data stream with the last
 byte equal to the numeric value represented by the that character);
 additional comments here would be useful.  (Admittedly it would have been
 helpful to look at that test in hex_unittest.cc first, being the easier
 case.)
 >

 Okay, added some comments (r2528).

 > '''src/lib/dns/util/sha1.h'''
 > '''src/lib/dns/util/sha1.cc'''
 > Not checked.  It is assumed that these are Donald Eastlake's
 implementation of SHA1 and have just been moved from their original
 location.
 >
 > '''dns/util/base_n.cc'''
 > Header documentation in the file omits reference to the base16 classes.
 >

 Ah, good catch, fixed (r2535).

 > class !EncodeNormalizer
 > __class !DecodeNormalizer__
 >
 I believe r2535 addressed these comments.  Some specific points follow:

 > The intention of !EncodeNormalizer appears to be that when the iterator
 reaches base_end it will return a pad character, and continue to return a
 pad character no matter how many times it is subsequently incremented.
 However, when initialized with with both arguments equal (i.e. the current
 iterator being equal to the end iterator) a dereference operation will
 dereference the end iterator. Initializing in_pad_ to be equal to (base ==
 base_end) should fix this.
 >
 > A similar comment apply to !DecodeNormalizer; if initialized with base
 equal to base_beginpad, it should be regarded as being in the padding
 region.
 >
 I've thought about it, and chose to just make comments on the assumption
 without modifying the code.  As you also noted, these are only intended to
 be used within this file, each used only once, so it wouldn't be much
 beneficial to try to validate the arguments 100%.  Also, if we make it
 very sure they are valid, we should also think about other invalid cases
 such as base_end beyond the end of the valid range, base_beginpad_ >
 base_end_, etc.  Making these perfectly correct would complicate the
 resulting code, and I suspect the accuracy isn't worth the complexity in
 this case.

 > __struct BaseNTransformer__
 > The documentation in the header could be a bit clearer as to why the
 third and fourth template parameters in the BaseNTransformer class are
 what they are.  It took a bit of rooting through the boost documentation
 to work out that they are correct.
 >
 I admit these are not easy to understand.  I've added more information
 about these boost-derived classes at the top of this file.  Hopefully
 these address your comment.  (We might make it even more detailed, but it
 will eventually be the boost documentation itself.  So we need to seek
 some reasonable balance).

 > In the calculation of padbits in BaseNTransformer::decode, the use of "&
 ~7" is a neat trick to round down to the nearest multiple of 8.  It took
 me a few seconds to work out why it was there though - perhaps a comment
 would be in order?
 >
 I agree this is not trivial.  Added more comments.

 > Not sure that the comment "Confirm the original BaseX text is the
 canonical encoding of the data" is completely correct - all the code does
 is check that the padding in the decoded data does not conflict with the
 padding in the baseN-encoded text.
 >
 Again, I agree this is not trivial, and, as I re-read the code I found the
 implementation could be simpler for this purpose.  I've added more
 comments, and revised the code (the new code passes tests).

 > Not sure why this is a "struct" rather than a "class" like
 !EncodeNormalizer and !DecodeNormalizer.
 >
 I follow one common (but probably not so widely deployed) convetion: use
 struct for stateless class, and use class if it maintains states.  But,
 admittedly, I'm not sure if this separation is helpful, and I also suspect
 I'm not always consistent in this separation.  If you prefer consistency,
 I don't mind changing BaseNTransformer to a class.

 > '''src/lib/dns/Makefile.am'''
 > The files util/base16_from_binary.h and util/binary_from_base16.h are
 not in the list of libdns++ sources (but they are included in
 util/base_n.cc)
 >
 Another good catch, addressed (r2536).

 > '''src/lib/dns/tests/Makefile.am'''
 > OK (but see "Other Comments" below).
 >
 > '''base16_from_binary.h'''
 > '''base32hex_from_binary.h'''
 > '''binary_from_base16.h'''
 > '''binary_from_base32hex.h'''
 > OK - in both these cases the differences from the original boost file
 seem reasonable.  The tables have been checked against those in
 [http://tools.ietf.org/html/rfc4648 RFC4648] and appear to be correct.
 >
 > Comment: The check on "t" in the "to_n_bit::operator()" method is "<=
 127" in binary_from_base32hex.h and "< sizeof(lookup_table)" in
 binary_from_base16.h.  Both have the same effect (the lookup table is 128
 bytes long), although the latter is preferable.
 >
 Ah, I was not consistent here.  I begain with keeping the style of the
 original boost base64 implementation (for base32hex), but then departed
 from that style for base16 because I thought it was "better".  In any
 case, it's better to make them consistent, and as you pointed out, it's
 probably a bit better to use sizeof.  Changed it (r2537).

 > '''src/lib/dns/util/README'''
 > OK.  Although it is not clear why the classes should not be used within
 other modules of BIND10.  So far the contents of util are
 binary<-->base16/32 converters and a copy of Donald Eastlake's
 implementation of the SHA1 code in [http://tools.ietf.org/html/rfc4634
 RFC4634].  These are not really DNS-specific, so could be in a general
 BIND-10 library.
 >
 Good point, I tried to clarify it in a revised README (r2539).  Actually,
 I'm not 100% sure this is an agreed development policy.  Maybe we should
 discuss it in the weekly call.

 > '''Other Comments'''
 > The version of the code checked did not include any unit tests of the
 base16 classes.

 It should be covered in hex_unittest.

 I know the terminology isn't consistent.  I don't know why it was
 originally named using "hex" (in the code I revised), but it's probably
 because BIND 9 called it "hex" and RFC3658/5155 simply call the
 corresponding representation "hex digits".  On the other hand, in util I
 mainly refer to the terminology defined in RFC4648, where "hex" is a (kind
 of) alias of base 16, so I chose base16.

 But should we rather be consistent on the terminology?

 ...or, are you talking about something different?

-- 
Ticket URL: <http://bind10.isc.org/ticket/256#comment:5>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development