BIND 10 #1177: NSEC support in new data source
BIND 10 Development
do-not-reply at isc.org
Thu Sep 22 22:39:57 UTC 2011
#1177: NSEC support in new data source
-------------------------------------+-------------------------------------
Reporter: | Owner: jinmei
jinmei | Status: reviewing
Type: task | Milestone:
Priority: major | Sprint-20110927
Component: data | Resolution:
source | Sensitive: 0
Keywords: | Sub-Project: DNS
Defect Severity: N/A | Estimated Difficulty: 5
Feature Depending on Ticket: | Total Hours: 0
Add Hours to Ticket: 0 |
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by jinmei):
Replying to [comment:11 vorner]:
> > - I'm not sure if this comment is relevant, but at least NSEC should
> > be able to coexist with NS/CNAME (and as commented DS should be okay
> > with NS, too):
>
> I added the NSEC and NSEC3 (I expect that one is there too). The DS
needs separate
> bool variable, as it is not allowed with CNAME.
Hmm, that's true for NSEC3, but I see some subtle difference for that
case. NSEC3 RRs effectively belong to a separate name space of the
zone, so the rationale why the owner name of an NSEC3 can have other
types is different from the reason for RRSIG and NSEC. In that sense
I'd update the code comment to reflect the subtlety:
{{{
// NSEC and RRSIG can coexist with anything, otherwise
// we've seen something that can't live together with
potential
// CNAME or NS
}}}
(maybe it's sufficient to add NSEC3 in addition to NSEC though)
> > - operator+: I personally don't think this definition is intuitive.
> > I'd expect an 'operator+' (with two params) to take the same types
> > of arguments and to meet other natural expectations derived from the
> > arithmetic '+' operator (such as the commutative law). Also, while
> > admittedly this may purely be a matter of taste, I generally agree
> > with the concerns documented in the google's guideline:
> > http://google-
styleguide.googlecode.com/svn/trunk/cppguide.xml#Operator_Overloading
>
> Actually, I don't agree there at all with the google styleguide. Then
we're getting to the java world, where you have to write
string_a.equals(string_b), because it can't overload operators. That is
much less readable code than with ==.
Actually, I said "*generally* agree" due to such examples. My opinion
is that there are cases where operator overloading contributes to
better readability, and if the benefit outweighs the pitfalls I'd
rather prefer using them than dogmatically rejecting them. So it's
case by case basis, and I said "generally" because IMO different
people tend to have different views on clarity or intuitiveness and a
simple operator can often cause more confusion than clarity.
Now, for this particular case...
> As with this concrete operator ‒ adding an element to a set is something
I did see written this way in mathematics, mainly in graph theory and
combinatorics (and even substracting an element from set).
...I've not seen textbooks in these theoretical areas for more than a
decade so I don't quite remember the exact notation in such cases. I
can see it if the operator + is used exactly as the operation of
union, i.e., setA + { element_being_added }, and I can imagine that
the literature may introduce a convenient abbreviation of allowing
"{}" to be omitted.
But when I first saw the code, it was just confusing; I first thought
it was one of the magical class methods of std::map and went to see an
online manual (unsuccessfully), then I found the private definition in
database.cc with my natural expectation that it should be a symmetric
operator, then found it was not, surprised, read the definition more
closely, then finally understood it.
Once I followed all these steps, I saw the code was correct, and once
I understood it, it was not difficult to read the rest of the code.
But I needed to spend a certain amount of time and brain cycles until
I reached that point. In such a case I wouldn't consider the use of
operator overloading a reasonable abstraction. Note specifically that
I had to go to cplusplus.com before knowing this operator is actually
our own invention.
> So, to me, the use of the operator for this is much more intuitive than
+ with strings. You have a valid point with the commutative thing, but it
isn't used the other way around (in the code, nor in mathematics), even if
it could be, so I just saved myself writing another copy with switched
order of parameters.
> Even if the definition might be less intuitive, I think the intention is
clear where used and probably can't be confused with any other possible
use of the + operator.
As I said above, the intention is understandable (not sure if it's
"clear") once you follow the non trivial steps. But the question is
how intuitive and easy to understand the intent is to the majority of
the readers of the code who tries to understand and possibly update
it, compared to other, possibly more verbose ways. Due to the hassles
I explain, it's not so obvious to me at best.
Citing another reference as you don't seem to like the google
guideline, I also "generally" with what is written in the "C++ Coding
Standards" book on this matter. If you don't have it, go to
http://www.amazon.com/exec/obidos/ASIN/0321113586,
click on "look inside" and search for "Tensor". "Programmers hate
surprises" (I was surprised when I first saw the implementation using
the operator+). "Programmers expect operators to come in bundles" (I
did when I first saw the implementation, and I was confused as I found
the expectation didn't hold). "Named functions ... therefore should
be preferred for clearer code if there can be any doubt about
semantics" (at least I had a doubt about it when I first saw it,
although I can't speak for many others).
> > - Considering these, I'd suggest something like this:
> > - Define nsec_types etc via a proxy function:
> > {{{
> > const set<RRType>&
> > NSEC_TYPES() {
> > static set<RRType>* types;
> > if (types == NULL) {
> > auto_ptr<set<RRType> > types_ptr(new set<RRType>());
> > types_ptr->insert(RRType::NSEC());
> > types = types_ptr.release();
> > }
> > return (*types);
> > }
>
> That's what I tried before introducing the operator and it turned out to
be a lot of typing and quite a lot of code, which was much less readable
than simple inline list of types separated by + signs.
(I just noticed we could actually replace new + auto_ptr with another
static object, but that probably doesn't make it so different in this
context)
I admit this approach looks less stylish and requires longer lines of
code (which I generally don't like either). But regarding the
readability I don't necessarily think it's worse. If I encounter
code:
{{{
found = getRRsets(wildcard,
NSEC_TYPES(),
true);
}}}
I'd assume NSEC_TYPES would be effectively a constant object just
idiomatically returned by a proxy function. So I'd then search for
the definition of it, and will find it:
{{{
// Convenient shortcut for a set of { NSEC }
const set<RRType>&
NSEC_TYPES() {
static set<RRType> types;
static bool initialized(false);
if (!initialized) {
types.insert(RRType::NSEC());
initialized = true;
}
return (types);
}
}}}
it doesn't contradict my expectation, and at least to me pretty easy
to understand without any surprise, unlike the case with operator+.
We'll need to do the same thing for DELEGATION_TYPES(), FINAL_TYPES()
and WILDCARD_TYPES(), and I admit repeating the same pattern is dull,
but it seems unlikely that we have to continue this practice, so it's
basically one time effort. The only other places where we need to
perform insert() are these:
{{{
found = getRRsets(name.toText(), final_types + type, name !=
origin);
...
found = getRRsets(wildcard, wildcard_types + type,
true,
&construct_name);
}}}
which would look like
{{{
WantedTypes wanted_types(FINAL_TYPES());
wanted_types.insert(type);
found = getRRsets(name.toText(), wanted_types, name != origin);
...
WantedTypes wanted_types(WILDCARD_TYPES());
wanted_types.insert(type);
found = getRRsets(wildcard, wanted_types, true,
&construct_name);
}}}
and while it results in 4 more lines which is not good, I can
personally accept that.
> So, I'd rather like to keep the operator there. It's pity we can't
overload the , operator, because that one would be even better for the
list. But this is C++, not perl6.
(Not sure what you mean by we can't - C++ allows overloading
"operator," but that doesn't matter much in this context anyway) I
don't claim the proxy function approach is obviously better. While
personally I prefer that because it's more straightforward and can
understand it without a surprise, I also see its drawbacks. And
although I less prefer the overloading approach, I don't see a
technical flaw in it (unlike statically initialized namespace-scope
objects). If you still believe the approach with an overloaded
operator with taking account into all I explained above, especially
that different people have different senses of clarity and it may not
be so obvious as the original code author tends to think (to whom it
tends to be so obvious because it's based on his/her own sense), I
won't fight against that further. But I have two final
suggestions/comments in that case:
- If possible I'd hope using something different from operator+
because this one is quite confusing at least to me. I once thought
about operator<<, but probably you might not like that either
because it may look modifying the left operand.
- Because of the point that clarity/readability is in the eye of the
beholder, I'd suggest revising this comment:
{{{
// This arguably produces more readable code than having bunch of proxy
// functions and set.insert calls scattered through the code.
}}}
I'd simply say, following the previous paragraph, something like
"this will help keep the lines of code shorter".
> > - the changes in the first for loop is not a straightforward
> > refactoring of the original code, and it's difficult to be sure if
> > it really preserves the original behavior. For example, is the
> > behavior same when DNAME exists and the first 'if' conditions are
met?
>
> Well, as I get it, that is the exact reason why we have tests. All the
previous tests did pass, so the behaviour we are interested in should be
the same. And even when I decided the original code was too complicated
and I just rewrote this bit and it potentially acted differently in some
situation than the original, if this one looks correct (and it does to
me), it should be OK, because we did not try the original in anything but
the unittests anyway.
>
> So, is there a reason to stick to the exact original code?
In general, I don't necessarily oppose to piggy-backing improvements
(changing the behavior) as part of "refactoring". In fact I often do
it myself. I see two issues in this particular context, however.
First, tests only provide necessary checks, not sufficient ones. The
fact that the new code doesn't break existing tests means it doesn't
break the original behavior only covered by the tests. And, as I
noted in the subsequent review comments, there were in fact uncovered
part of the code. Second, I wouldn't be too much worry about
piggy-backing if the original code were reasonably simple and easy to
understand as I could be quite confident that the behavior change is
safe. In the case of this code, however, as we both seem to have
agreed the code is quite complicated, and I cannot be sure that the
behavior change doesn't have other adverse effects I'm missing.
So, in this specific case, I'd be much happier if we could make it
more gradually, i.e., fast refactor the code in its strictness sense
without introducing behavior changes, then improve the behavior based
on the (hopefully) now more understandable code.
That said, rewinding everything at this point may not be an effective
way of development. If you can list up things you changed with why,
and if it's reasonable, that's probably sufficient.
--
Ticket URL: <http://bind10.isc.org/ticket/1177#comment:13>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list