BIND 10 #781: Define cryptographic API

Mon Apr 18 23:35:42 UTC 2011

#781: Define cryptographic API
-------------------------------------+-------------------------------------
                 Reporter:  stephen  |                Owner:  jinmei
                     Type:  task     |               Status:  reviewing
                 Priority:  blocker  |            Milestone:
                Component:           |  Sprint-20110419
  Unclassified                       |           Resolution:
                 Keywords:           |            Sensitive:  0
Estimated Number of Hours:  6.0      |  Add Hours to Ticket:  0
                Billable?:  1        |          Total Hours:  0
                Internal?:  0        |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:18 jelte]:
 > > '''Trivial things'''
 > > - I made some trivial editorial changes.
 > > - I also changed 'catch exception by object' to 'by const reference'.
 > >   (I suggest you try cppcheck.  It's very good at finding this issue)
 > >
 >
 > thanks :) (i don't really use cppcheck much yet since it is a tad noisy
 here, we should probably put in a bit of effort to make it clean and keep
 it that way)

 FYI on cppcheck: Older versions of cppcheck seem to complain about
 more things (in some sense it's good, but it also means relatively
 more false positives).  I'm using 1.47, and it should run cleanly (via
 'make cppcheck' at the top_dir).  We even run it on a buildbot.

 BTW: I've added this level of these fixes to the revised code, too.

 '''Some higher level discussions'''
 > > - I'm okay with the design decision of making this stuff DNS related
 > >   things only (but IMO it should be noted in the documentation).  But
 > >   in that case should it better be in libdns++?  [...]
 >
 > Right. Originally I did not intend to make libdns make direct use of
 libcrypto, so having a dependency the other way wouldn't be a problem.
 Right now we have no plans to use cryptographic functionality outside of
 libdns++ (as far as i'm aware), but there is quite a big chance this may
 change in the future. Another question is how 'clean' we want libdns++ to
 be regarding non-dns specific things (like crypto :)).

 Now that the mutual dependency problem is going to be solved, I'm okay
 with separating these.

 > > - I'm still not sure why we need to use OutputBuffer to store
 > >   results.  Isn't a simple vector (or even a plain old array because
 > >   we should be able to know the size) sufficient?
 >
 > We don't *need* it, but i figured outputbuffer would me the most logical
 type to have for this, which would save the caller from another
 checkoutputlength/alloc set of calls. A vector would do that as well, but
 will cause yet another copy afterwards. This was under the assumption that
 the caller would have an outputbuffer already anyway, so as i noted in my
 previous comment, it should be pretty easy to change or add sign() calls
 with the needed types.
 >
 > Anyhow, I added both for flexibility. I also added a size_t len argument
 for all three, so we can support truncated signatures.

 Hmm, maybe you're thiking about updating a buffer to construct a
 complete DNS message with a TSIG digest "on the fly"?  That is,
 conceptually like this?

 {{{
    message.renderHeader(buffer);
    message.renderAnswer(buffer);
    message.renderAuthority(buffer);
    message.renderAdditional(buffer); // without TSIG
    message.renderTSIGParameters(buffer); // render params before digest
    signHMAC(buffer.getData(), buffer.getLength() - sizeof(TSIG), buffer);
 }}}

 If so, this is actually different from what I'd do for signing.  I
 suspect, due to various corner cases that would require the
 incremental (init-)update-final steps, we'd normally end up getting
 the digest in a separate space and copying it to the buffer/renderer.

 But anyway, we cannot just be so sure how it would actually be used
 yet, so I don't necessarily request this interface to be removed at
 this stage.

 > > - I noticed the new library is named "libbcrypto" (double b). [...]
 >
 > It was intentional, what about libb10crypto? (to make it more clear
 > >   that it is not a typo :), and leaves the option open to use it
 > >   for non-dns things)
 >
 > Changed it to that now.

 Frankly, libb10crypto sounds a bit awkward:-) but I must confess I'm
 not good at naming things.  I just come up with a generic name
 "libcryptolink", but since at least b10crypto is much less confusing
 I'm okay with it, too if you like it better.

 > > - do we need a python wrapper for the new crypt stuff?
 >
 > Probably, but it seemed like a good idea to agree on the API first.

 Fair enough.

 > > - getHash(): is the overhead of this function acceptable? (pure
 > >   question, I'm not sure).  For sha256, we need name comparison
 > >   (relatively expensive) three times, and it happens every time we
 > >   need to sign/verify hmac.
 >
 > Well, this comparison has to occur somewhere when going from a TSIGKey
 to an underlying algorithm. Since for the TSIGKey removal I introduced an
 enum for hmac hash algorithms, this is no longer a problem of this
 specific lib :)

 Okay, we may want to convert string/name to the crypto definition at
 the time of TSIGKey construction, but that's another issue.

 > > - I suspect the way of defining 'init' can cause a static
 > >   initialization fiasco (of course it depends on the details of
 > >   LibraryInitializer, which I didn't dig into.  this is a general
 > >   concern)
 >
 > I 'copied' the approach from botan; i.e. there is a RAII object for
 > >   this now, and the problem is passed off to the program using
 > >   this lib. We need something like this in the future anyway for
 > >   initialization options.

 Okay, this is much safer, but the new approach seems to have another
 issue: a buggy program can instantiate it multiple times (not sure if
 it causes a bad thing, though).  Also, do we really expect it to be
 destructed at the end of the program?  If not we don't even need the
 RAII trick; we can simply provide an initialization function, say,
 initCrypto(), and have the main program call it (this approach doesn't
 solve the duplicate initialization problem, though).

 Beside, requiring programs to do something specific (either
 instantiate an RAII object or call an init function) is inconvenient;
 I guess you were aware of that and that's why you hid it in crypto.cc
 in the first implementation.  IMO safety is more important than
 convenience in general, so if the policy is to accept this
 inconvenience, I'm okay with that.  But if we want to avoid that while
 avoiding the initialization fiasco, one approach would be to hide the
 initialization logic in the wrapper classes such as HMACImpl.  This
 approach has a drawback that we must not forget doing it in every
 wrapper class, so this is basically a design tradeoff.  I'm okay with
 any safer approach.

 > > - HMACImpl constructor: this is not exception safe (hmac_ could leak).
 >
 > added delete

 Ack, but to be super exception safe, I'd rather (e.g.) use
 boost::scoped_ptr.  I don't know the details of the Botan internals,
 but, for example, if it internally performs (throwing version of) new
 and doesn't convert the standard exceptions, the revised code is still
 not exception safe.  Using RAII like scoped_ptr will ensure the
 safeness regardless of the Botan internals.  (And if we use scoped_ptr
 we don't need an explicit destructor)

 > > - HMACImpl::update/sign/verify: can we assume the underlying Boton
 > >   methods don't throw?
 >
 > only invalidkeylength, afaict, should we add 'general'
 > Botan::Exception catches?

 It might depend on the internal details of Botan, but as long as we
 want to prevent any Botan specific exceptions from being propagated,
 I think it would be safer to catch larger exceptions.  That way we can
 ensure that even if the Botan implementation changes in its future
 versions.

 > > '''crypto_unittests.cc'''
 > > - checkBuffer: in ASSERT_EQ() the expected data is the first
 > >   parameter.  it seems to be reversed.  Also, you may want to use
 > >   UnitTestUtil::matchWireData in lib/dns/tests for this type of test.
 >
 > That would introduce a build-order-dependency conflict again :)

 Yeah I know.  So for now it probably doesn't make sense to use
 matchWireData.  In future I think we should extract non DNS specific
 test utilities into a shared place so that it can be used by other
 modules.

 > > - I'm not sure if the :-separated notation is a good one.  [...]
 >
 > we already have the three-part constructor (the original one with Name
 args). This is simply to make it possible to use the de facto standard for
 the string representation, as introduced by dig and copied by multiple
 other implementations. If the name contains a ':', it will simply fail to
 create a key using that representation. We could add logic for escapes
 etc. But we could also simply say not to use names with :.

 Okay.  I have more comments about the implementation, though.  See below.

 > > - in that constructor, is it a good idea to catch all isc exceptions?
 >
 > No, but it's better than enumerating them all and catching them one by
 one (and almost certainly missing one or two). I think we should have a
 hierarchy of exceptions actually (like nameparseerror that is a superclass
 for emptylabl, toolonglabel, etc.etc.). Probably something to discuss
 and/or make a task for.
 >
 > >   can't we simply let various exceptions be propagated?
 >
 > Then the caller would have the same problem. This way I can honestly say
 that the \exception list in the docs are complete :) (and, more
 importantly, that the caller does not have to take 15 different exceptions
 into account, which may or may not change, not to mention the fact that
 the documentation will probably not be updated if it does indeed change).

 For these points, I basically envision that higher level applications
 catch exceptions at the level of isc::Exception (or even
 std::exception, from which all isc::Exception classes are derived),
 just like b10-auth does (in which case it returns SERVFAIL).  The
 above questions of mine are based on this observation.  But I admit
 this is probably a debatable point, and so I'm okay with keeping the
 current code and defer the discussion to a separate ticket or dev list
 thread.

 '''proposed changelog entry:'''
 > [func] Introduced an API for cryptographic operations. Currently it only
 supports HMAC, intended for use with TSIG. The current implementation uses
 Botan as the backend library.

 I'd note that build now requests Botan.  I'd also mark this entry with
 '*' due to the additional build time dependency.

 Now some more code level comments on the revised one:

 '''crypto.cc'''
 - HMACImpl::verify: is the "len > getOutputLength()" case tested?
   (just checking)
 - HMACImpl constructor now doesn't have to be "explicit" (because it
   takes more than one parameters).  but I don't mind if you
   intentionally left it so that e.g. we don't forget to re-add it if
   and when we re-revise it to a single-parameter constructor.  same
   for the HMAC constructor.
 - HMAC::HashAlgorithm I'd add doxygen description for the enums (even
   though the meaning is pretty clear from the symbol).  also, I'd like
   to see clarification of what "UNKNOWN" means (e.g. whether it's just
   a placeholder and always trigger some error, or if it can be used
   in some valid scenario)

 '''crypto/tests'''
 - in checkData(), expected/actual parameters to ASSERT_EQ() still seem
   to be reversed (maybe it's helpful to name them "expected_data",
   "actual_data" instead of "data1"/"data2")
 - doHMACTestArray() uses a variable length array, which is
   (unfortunately) not very portable.  considering this is a test,
   maybe we can just allocate a memory dynamically, noting it's not
   exception safe as a comment (we could also use
   std::vector/boost::scoped_array)
 - about secret/expected data (I should have brought this up in the
   first review, sorry for the raising a new topic): the arrays can be
   const uint8_t[].  as for std::string data, shouldn't it actually
   better be vector<uint8_t>?  In either case, the initialization using
   the for loop could be simplified:
 {{{
     const std::string data3(50, 0xdd);
     // std::string data3;
     // for (int i = 0; i < 50; ++i) {
     //    data3.push_back(0xdd);
     //}
 }}}
   (as a bonus we can also make the variable const this way)

 '''tsigkey test'''

 - I'd use a different name than "TSIGTest" (for TSIGKeyFromToString)
   because I'd like to use it for TSIG sign/verify (I hope it's fair to
   say "TSIGTest" is more suitable for these kinds of tests:-).  google
   test doesn't seem to allow name conflict at this level.

 '''tsigkey.cc'''
 - TSIGKey from string constructor: (I should have pointed this out in
   the first review) you may want to make it an 'explicit' constructor.
 - TSIGKey from string constructor: this seems to be incorrect:
 {{{
         } else {
             pos2 = str.size() - pos;
         }
 }}}
   You can realize the bug by making the len(key name) <= len(secret):
 {{{
     EXPECT_EQ("example.:MSG6Ng==:hmac-md5.sig-alg.reg.int.",
               TSIGKey("example.:MSG6Ng==").toText());
 }}}
   We could fix it in a straightforward way, but I actually wonder
   whether we could achieve what we want using a higher level
   interface, rather than via lower level of "position + length" APIs,
   which are generally error prone and difficult to understand.
   So, for example, how about the attached diff?  This one is not that
   concise compared to the original code, but the logic should be more
   intuitive and much safer.
 - Regarding the previous point, if you choose fixing the original
   code, I suggest deferring the definition of "secret_str" just before
   it's needed.
 - And, in either case, it would be good if we extract the alg_name
   check used in the two constructors.

-- 
Ticket URL: <http://bind10.isc.org/ticket/781#comment:19>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development