BIND 10 #2387: support generic version of rdata::createRdata(text) in DNSKEY, NSEC3, NSEC3PARAM

Thu Mar 28 03:00:29 UTC 2013

#2387: support generic version of rdata::createRdata(text) in DNSKEY, NSEC3,
NSEC3PARAM
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  task          |  jinmei
            Priority:  medium        |                       Status:
           Component:  libdns++      |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130402
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  5             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |  loadzone-ng
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by muks):

 * owner:  muks => jinmei

Comment:

 Hi Jinmei

 Replying to [comment:14 jinmei]:
 > '''nsec3param_common.h'''
 >
 > - This doesn't seem to be very correct:
 > {{{#!cpp
 > /// \exception MasterLexer::LexerError There was a failure reading a
 > /// field from the MasterLexer.
 > }}}
 >   "failure" sounds like a low level (quite unexpected) error such as
 >   an I/O error, but `LexerError` is more expected to be thrown against
 >   syntax errors (such as a string given when a number is expected or
 >   an unexpected end of input).

 This comment has been updated now.

 > '''dnskey_48.cc'''
 > - the doxygen doc for the constructors share some amount of text.  I'd
 >   unify them, e.g., by referring from one to the other.

 Done.

 > - is this good?
 > {{{#!cpp
 >     while (true) {
 >         const MasterToken& token = lexer.getNextToken();
 >         if (token.getType() != MasterToken::STRING) {
 >             break;
 >         }
 >         keydatastr.append(token.getString());
 >     }
 > }}}
 >   This way, even if the lexer founds a syntax error, it exits from the
 >   loop just like the normal case and the error detection is left to
 >   the caller.  It seems to me better if we keep extracting strings
 >   exclusively until we see end of line/input and catch any
 >   intermediate error at the exact point.  BTW, I found nsec_bitmap.cc
 >   has the same issue.

 I have implemented this. But I don't know of a way to test error cases
 (except data after end-of-line), as even delimited numbers show up as
 string tokens.

 > - probably marginal, but it'd be slightly more efficient if we use the
 >   other version of `getString()`
 > {{{#!cpp
 >     string keydatastr, keydata_token;
 >     while (true) {
 >         const MasterToken& token =
 >           lexer.getNextToken(MasterToken::STRING, true);
 >         if (token.getType() != MasterToken::STRING) {
 >             break;
 >         }
 >       token.getString(keydata_token);
 >         keydatastr.append(keydata_token);
 >     }
 > }}}
 >   (and while it's out of scope of this ticket, the same point applies
 >   to buildBitmapsFromLexer() regarding `type_str`).

 This is now updated.

 > - not directly related to this ticket, but what's the rationale of
 >   this?
 > {{{#!cpp
 >     if (keydatastr.size() == 0) {
 >         isc_throw(InvalidRdataText, "Missing DNSKEY digest");
 >     }
 > }}}

 This is to check that there is some key data that was read before
 end-of-input was reached. This serves the same purpose as `have_rrtypes`
 in `buildBitmapsFromLexer()`. I've updated the code with a comment.

 > - ditto, but this one too:
 > {{{#!cpp
 >     if (algorithm == 1 && keydata.size() < 3) {
 >         isc_throw(InvalidRdataLength, "DNSKEY keydata too short");
 >     }
 > }}}

 This code was not introduced by this branch, but was present in the
 previous implementation. It is to support RSA/MD5 tag extraction:
 http://tools.ietf.org/html/rfc4034#appendix-B.1

 See `DNSKEY::getTag()`. I've updated the code with a comment pointing to
 the RFC.

 > - and why don't we do the same check for the "from wire" case if we
 >   need to do it?

 I don't know why this was not done previous to this branch. This branch
 only touched the `MasterLexer` related parts of the code, so it didn't
 modify the wire cases. I've added code towards this now.

 > - and, if we allow empty key this code is not really correct:
 > {{{#!cpp
 >     buffer.readData(&keydata[0], rdata_len);
 > }}}
 >   because accessing `[0]` is invalid for an empty vector.

 The content of the public key field is dependent on the choice of
 algorithm. I guess it's safe to assume that the public key should always
 be non-empty. So I've added a check that the rest of the rdata is at
 least 1 byte long, after reading the initial fields.

 The other option is to add checks specific to algorithms, but we can
 avoid that here.

 > '''nsec3_50.cc'''
 > - redundant doc (see dnskey comments)

 Done.

 > - this can be a reference:
 > {{{#!cpp
 >     const string nexthash =
 lexer.getNextToken(MasterToken::STRING).getString();
 > }}}

 I've added the reference now. (I'm still unsure about where we can use
 references when an object is returned such as the case above; I'll go
 through the C++ spec).

 >  (and the line is too long)

 The line was 80 columns long. There is nothing in our CodingGuidelines
 about max line length, so I assumed 80 columns. I've added this to the
 CodingGuidelines page now:
 http://bind10.isc.org/wiki/CodingGuidelines#Linelength

 If you are using emacs and your terminal is set to 80 columns, emacs
 will wrap the line at 79 cols to show the '\' wrap marker. You can use
 `column-marker.el` for checking if something exceeds 80 cols (it will
 show a background highlight in some shade of red if a line goes over
 80).

 > - this can be a reference:
 > {{{#!cpp
 >     const MasterToken token = lexer.getNextToken();
 > }}}

 Done.

 > - is it possible to avoid doing `ungetToken()` here:
 > {{{#!cpp
 >     if ((token.getType() == MasterToken::END_OF_LINE) ||
 >         (token.getType() == MasterToken::END_OF_FILE)) {
 >          empty_bitmap = true;
 >     }
 >
 >     lexer.ungetToken();
 > }}}
 >   by unifying it in `buildBitmapsFromLexer`?  `ungetToken` for a
 >   string is a bit expensive because you'd need to copy it twice.

 These have been unified. It resulted in cutting down a bit of code too.

 > '''nsec3param_51.cc'''
 > - redundant doc (see dnskey comments)

 Done.

 > '''masterload_unittest.cc'''
 >
 > Is that intentional there are two '#' signs here?
 > {{{
 > // doesn't support them. This test should be fixed and re-enabled by
 > // ##2381, or deleted.
 > }}}

 No it was not intentional. Removed. :)

 > '''tests in general'''
 >
 > It looks suboptimal that string and lexer versions have different sets
 > of tests (e.g., badText and createFromLexer for DNSKEY).  They cause
 > duplicates/redundancy, as well as incomplete cases.  I'd generally
 > unify these cases.  See, for example, `Rdata_IN_A_Test::createFromText`.

 These have been unified (as much as it makes sense). Some are still
 directly tested.

 > '''rdata_dnskey_unittest.cc'''
 >
 > - This case is now moot:
 > {{{#!cpp
 >     // Missing algorithm
 >     EXPECT_THROW(generic::DNSKEY("257 3 5BEAAEFTd"), InvalidRdataText);
 > }}}
 >   This test was specifically crafted for this code:
 > {{{#!cpp
 >     unsigned int flags, protocol, algorithm;
 >     stringbuf keydatabuf;
 >
 >     iss >> flags >> protocol >> algorithm >> &keydatabuf;
 > }}}
 >   which would incorrectly accept the above invalid input.  It doesn't
 >   make much sense to use the same test case because our lexer is never
 >   confused with token boundary this way.  If we really want to check
 >   the "missing algorithm" case, we should at least make the base64
 >   part valid; otherwise it may not be really clear if the
 >   implementation correctly detects the algorithm is missing.

 These tests were not written with implementation in mind. It doesn't
 matter to the test if a `MasterLexer` is being used, or if its behavior
 is such that this case will never occur. It tests the construction from
 RDATA as a black box.

 I've updated it to use a valid base64 encoded key.

 > - Likewise, this one:
 > {{{#!cpp
 >     // How about this?  It's even more confusing for the parser because
 >     // it could be ambiguous '51 EAAA' vs '5 1EAA..'. But we first parse
 >     // the other fields before we parse the public key, so we reject
 >     // this.
 >     EXPECT_THROW(generic::DNSKEY("257 3 51EAAEFTd"), InvalidRdataText);
 > }}}
 >   "how about this?" part doesn't make sense any more, and it's not
 >   really confusing at all (the input is obviously invalid).  And, it's
 >   not even clear whether we need two cases like this.

 This test was actually identical (I didn't notice it before), so I've
 removed it now.

 > - cases where a non number is used for a number field are missing.
 >   checking spaces in base64 also seems to be missing.  there may be
 >   some other missed cases.  Please check corner cases as if it were
 >   developed test-driven.

 The existing code missed many test cases. I don't know what "as if it
 were developed test-driven" means here. If this is to say that
 `master` branch misses many tests and we should add them now to make up
 for it, I have done this and tested the branch for the cases I could
 think of. You are pointing some in the review, which will be added too.

 If this statement is implying that new code in this branch was not
 tested (with covered tests) as it was being written, there seems to be a
 misunderstanding. Is this gathered from the commit sequence?  Sometimes
 I add checks immediately after writing the code, that I think of when
 writing the implementation. But these are added in the next few commits.
 Sometimes, indeed some bugs are found in implementation only later, or
 some missing tests are thought of when reviewing code and they are
 fixed/added later.  But this doesn't mean that the code is not covered,
 or it was written without intention to test it.

 > '''rdata_nsec3_unittest.cc'''
 >
 > - number vs non-number cases seem to be missing.  there may be some
 >   other.

 Number vs. non-number are tested.

 > '''rdata_nsec3param_unittest.cc'''
 >
 > - number vs non-number cases seem to be missing.  bad hex case is
 >   missing.  there may be some other.

 Number vs. non-number are tested. Bad hex case is also tested.

 I have also patched an offshoot of #2391.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2387#comment:16>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development