BIND 10 #2375: implement MasterLexer getNextToken() and ungetToken()

Thu Nov 22 13:30:55 UTC 2012

#2375: implement MasterLexer getNextToken() and ungetToken()
-------------------------------------+-------------------------------------
                   Reporter:         |                 Owner:  jinmei
  jinmei                             |                Status:  reviewing
                       Type:  task   |             Milestone:
                   Priority:         |  Sprint-20121204
  medium                             |            Resolution:
                  Component:         |             Sensitive:  0
  libdns++                           |           Sub-Project:  DNS
                   Keywords:         |  Estimated Difficulty:  5
            Defect Severity:  N/A    |           Total Hours:  0
Feature Depending on Ticket:         |
  loadzone-ng                        |
        Add Hours to Ticket:  0      |
                  Internal?:  0      |
-------------------------------------+-------------------------------------
Changes (by vorner):

 * owner:  vorner => jinmei

Comment:

 Hello

 Replying to [comment:13 jinmei]:
 > > To clarify, so it should be a thin wrapper around the current
 getNextToken?
 >
 > Yes (depending on the definition of "thin", now that our body of
 > getNextToken() is also pretty small).  The other version of
 > `getNextToken` would call the currently defined one with specifying
 > some particular set of options, and handle the case of unexpected
 > EOL or unexpected string when a number is requested.

 Hmm, the branch is getting large enough. So I'd propose to put it into
 another ticket, then.

 > So, to me, the question to be asked is opposite: whether there's any
 > advantage of handling the exception here.

 > Maybe you don't like it because the total amount of code of this
 > version is even bigger than just using try-catch.  But at least it
 > simplifies the main code of `getNextToken()` and doesn't have the
 > performance issue with any type of try-catch implementation unless an
 > exception is actually thrown.

 I'd be opposed to that approach not only because the amount of code is
 larger.
 I find it much less readable than the try-catch one. The try catch clearly
 states the intention ‒ „Do the transitions and if anything goes wrong, try
 returning back to previous state“. While the version with guard would read
 something like „First, make sure we return to the previous state
 afterwards. Do
 the transitions. And, oh, well, I changed my mind. Don't return to the
 previous
 state.“

 Therefore, I removed the try-catch there.

 > I agree your version is more logical.  But I don't think it wise to
 > just ignore the fact that the BIND 9 version has been working for
 > quite a long time against many very unusual or even broken inputs.
 > So, while I'm okay with resetting last_was_eol in the end, I suggest
 > you really think about any possible problematic cases where the
 > difference can cause, rather than just dismissing it "because it's
 > more logical".  I'd also ask this simple question (why it doesn't
 > presever-restore last_was_eol) at bind10-dev (some core BIND 9
 > developers subscribe to it).  Actually I suspect it's less likely
 > you'll get an answer, but we can then at least know there's not a
 > simple reason for it, if any.

 I did write the email. But I still don't agree with your point. It can be
 seen
 from the other side too. One of the reasons why bind10 was started was the
 fact that bind9 was hard to maintain and read. And, supposing it was not a
 forgotten revert to previous state (which didn't manifest, therefore the
 difference wouldn't matter), but a sideeffect that was actually used
 somewhere,
 it would be one of the places where it would be really hard to read and
 needed
 to be eliminated, not copied. Bind10 is not supposed to be simple rewrite
 of bind9
 to C++.

 > Hmm, I'm not sure how it relates to this discussion, but that's a good
 > question anyway.  After clarifying and simplifying the state
 > transitions for our limited need, we basically have only two cases:
 > - everything is completed within State::start().  state() sets the
 >   token and returns NULL
 > - one transition from start() to other derived state.  Each state
 >   terminates the transition and its handle() always sets the token and
 >   returns NULL
 > This won't change when we complete the transition with the NUMBER
 > state.

 It indeed does relate to this discussion. If there is never a state that
 returns anything non-NULL from its handle() method, there'd be no test to
 properly test the return value is correctly handled without the fake
 class.

 So I changed the return type of handle to void (which mandates there is
 never a
 chain of states other than you describe), which can be tested by the
 current
 states without faking.

 > > I'm not sure it can be reasonably tested by read-only accessors. The
 goal of
 > > the fake state is to change the internal state and mess around with
 it.
 >
 > I know, but since `getNextToken()` itself doesn't rely on the internal
 > states much, I don't see the need for arbitrarily modifying the
 > states.

 But ungetToken does rely on the internal state to some level. It must know
 which variables to restore. I tried to catch them with some ungets on
 various
 places in input, but the effectiveness of the tests relies on the
 knowledge of
 internal implementation slightly. But maybe it's less bad than the fake
 state.

 > Another example is an unbalanced '(' followed by EOF.  We first need
 > to detect the error and then detect EOF.
 >
 > So, we should clearly do something, at least if we want to be
 > compatible with BIND 9.

 Now I see the problem.

 I don't know if we need to be compatible with bind9 in these very internal
 interfaces. But it sounds reasonable to report the error and then EOF,
 because the caller needs to know no more tokens (or errors) will get out.
 So I changed it.

 > Ah, right.  I've just pushed it.  I also made another minor editorial
 > fix on the old branch, which has been pushed too.

 Cherry-picked both.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2375#comment:15>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development