BIND 10 #2371: define dns::MasterLexer class

Mon Nov 5 18:20:45 UTC 2012

#2371: define dns::MasterLexer class
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  jinmei                             |                Status:  reviewing
                       Type:  task   |             Milestone:
                   Priority:         |  Sprint-20121106
  medium                             |            Resolution:
                  Component:         |             Sensitive:  0
  libdns++                           |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  5
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
  loadzone-ng                        |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:15 muks]:

 > * This can be removed from `master_lexer.cc` now I think:
 > {{{
 > -#include <sstream>
 > }}}

 Ah, good catch.  Removed.

 > * I am trying to follow why the `END_OF_STREAM` definition became a
 duplicate for you. It worked here on Fedora GCC 4.7.2, so I assume this is
 a difference among compiler implementations. What compiler and version did
 you use to compile it?

 First to be clear: #2369 itself compiles.  The problem happened when I
 integrated it to this branch, where two files include inputsource.h.
 I use clang++, a specialized version by apple based on LLVM 3.1.  I
 guess you can reproduce it yourself on the macmini buildbot if you're
 interested.

 My guess is that in your case the linker combines the two definitions
 of the same constants (as they are constant) as a kind of optimization.

 > Also, does it have to be defined in
 `master_lexer_inputsource_unittest.cc`? That place seems wrong to me. Is
 it not because while there are protos everywhere, there's no actual
 storage allocation for it anywhere? Was it the compiler or linker which
 complained?

 It's a linker error:
 {{{
 Undefined symbols for architecture x86_64:
   "isc::dns::master_lexer_internal::InputSource::END_OF_STREAM",
 referenced from:
       (anonymous
 namespace)::InputSourceTest_compactDuring_Test::TestBody() in
 run_unittests-master_lexer_inputsource_unittest.o
       (anonymous namespace)::InputSourceTest_compact_Test::TestBody() in
 run_unittests-master_lexer_inputsource_unittest.o
       (anonymous
 namespace)::checkGetAndUngetChar(isc::dns::master_lexer_internal::InputSource&,
 char const*, unsigned long) in run_unittests-
 master_lexer_inputsource_unittest.o
 ld: symbol(s) not found for architecture x86_64
 }}}

 This is a common problem we've seen, and we've applied the same
 workaround (see name_unittest.cc for example).  Note that we normally
 don't need the explicit definition - for example auth/name/query.c
 uses Name::MAX_WIRE but we don't need this hack there.

 I've now looked into it more deeply, and found that EXPECT_EQ tries to
 take a reference to the parameters:

 {{{#!cpp
   template <typename T1, typename T2>
   static AssertionResult Compare(const char* expected_expression,
                                  const char* actual_expression,
                                  const T1& expected,
                                  const T2& actual) {
     return CmpHelperEQ(expected_expression, actual_expression, expected,
                        actual);
   }
 }}}

 That should be why in some tests we need the explicit definition.
 Maybe depending on the compiler it optimizes when the expected or
 actual is essentially constant, or it actually creates an instance of
 the integer for the class constant (and so the test code can refer to
 its address).

 Now what should we actually do?  As you pointed out, defining it in
 the test is probably not a good hack.  Defining it in
 master_lexer_inputsource.cc could be one solution, but I'm not sure if
 this is the best solution, because we only need to take the address in
 tests, so by defining it in the main .cc we create one instance in
 memory that is not necessary for the production code.

 So, my latest suggestion would be to define a local variable in specific
 tests that require the definition of such class static constants and
 use it in EXPECT_xx:

 {{{#!cpp
 namespace {
 // explain why we define a separate variable here
 const int END_OF_STREAM = InputSource::END_OF_STREAM;
 ...
     // This should cause EOF to be set.
     EXPECT_EQ(END_OF_STREAM, source.getChar());
 }
 }}}

 Would that work for you too?

 > * Does this work for you instead, without complaining about duplicates:

 Yes.  But see above.

 > * The error condition reporting in `MasterLexer::pushSource()` seems
 inconsistent to me. For the two error conditions that may happen, we throw
 for one file-related problem, and we return `false` and an error string
 for another file-related problem, both in the same method to open a new
 file. Maybe we can simply document that it could throw
 `InputSource::OpenError` and let the caller handle both.

 Good point, and this is actually one of the things I intended to
 document but forgot.  After thinking about how we handle user errors
 in `MasterLexer`, I concluded the best is to report possible user
 errors via return value and some kind of error code, while throwing an
 exception for other "more-unexpected" events such as disk failure
 after opening a file.  It's different from what other classes of the
 library normally do, so we can discuss it if it really makes sense.
 For now, I've added the documentation about my considerations.

 BTW, if we decide to let the exception from `InputSource` go through
 the lexer, we should move it to the main master_lexer.h (whether it's
 in or `MasterLexer` class or defined at the isc::dns namespace level);
 we shouldn't disclose the existence of the InputSource class to the
 application in the first place...hmm, in that sense `ReadError` should
 actually be moved to master_lexer.h unless we decide to always catch
 it within `MasterLexer`.

-- 
Ticket URL: <https://bind10.isc.org/ticket/2371#comment:16>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development