BIND 10 #2218: update InMemoryZoneFinder::findNSEC3 using memory-efficient version
BIND 10 Development
do-not-reply at isc.org
Tue Sep 25 07:27:00 UTC 2012
#2218: update InMemoryZoneFinder::findNSEC3 using memory-efficient version
-------------------------------------+-------------------------------------
Reporter: | Owner: jinmei
jinmei | Status: reviewing
Type: task | Milestone:
Priority: | Sprint-20120925
medium | Resolution:
Component: data | Sensitive: 0
source | Sub-Project: DNS
Keywords: | Estimated Difficulty: 5
Defect Severity: N/A | Total Hours: 0
Feature Depending on Ticket: |
scalable inmemory |
Add Hours to Ticket: 0 |
Internal?: 0 |
-------------------------------------+-------------------------------------
Changes (by muks):
* owner: muks => jinmei
Comment:
Hi Jinmei
Replying to [comment:11 jinmei]:
[snip]
> First, regarding `successor()`: at least your revised code doesn't use
> `successor()`, so it can (and I would say should) be private again.
> Next, as for `predecessor()`: you almost reimplemented the logic that
> is already available `DomainTree::previousNode` as a result of using
> `predecessor()`. Apart from whether it makes sense to expose
> `predecessor()`, the resulting code is duplicate and IMO much more
> unreadable. The whole re-invented logic can be much more simplified
> if you use previousNode:
>
> {{{#!cpp
> } else {
> const ZoneNode* last_node = chain.getLastComparedNode();
> while ((covering_node = tree.previousNode(chain)) != NULL &&
> covering_node->isEmpty()) {
> ;
> }
> if (covering_node == NULL) {
> covering_node = last_node->getLargestInSubTree();
> }
>
> if (!recursive) { // in non recursive mode, we are done.
> ...
> }}}
>
> If you agree that this one is simpler and more readable, then we also
> don't need to make `predecessor()` public either, and I'd suggest
> keeping it private.
[snip]
> > I don't follow what you mean by exploit here. Do you mean not have the
tree-of-tree structure? `TreeNodeRRset` gets its name directly from the
node in the tree-of-trees.
>
> What I meant (re-reading it now, I admit the original comment wasn't
> clear) is that I'd rather define and use something like
> `TreeNode* DomainTree::largestNode()`, which returns the largest node
> in the tree of tree. In the case of the NSEC3 tree, this should be
> the same as "largest in subtree" (in the subtree of NSEC3 hash names).
>
> I admit the encapsulation of `DomainTree` isn't really clean, and we
> sometimes allow the applications to use the knowledge of its internal
> structure. But when possible, I'd like to hide such details as it's a
> "tree of tree" as much as possible, and have applications use
> `DomainTree` just like some kind of abstract container. The concept
> of "getting the largest entry" can exist for any type of container
> that has the notion of ordering, but `getLargestInSubTree` looks too
> much rely on specific internal details of it. It might be less likely
> that we'll fundamentally revise the internal details (e.g., like
> changing tree to hash), but if we have choices for some purpose that
> have comparable costs, I'd like to choose the one that can hide
> internal details more.
I follow what you're trying to do, i.e., keep `DomainTree` encapsulated as
much as possible. In the first case, I used `predecessor()` because
`previousNode()` can exit the sub-tree and go to the upper level. I've
changed it so that it uses the code above now.
Replying to [comment:12 jinmei]:
> Replying to [comment:10 muks]:
>
> > > - The original code takes care of the case where there's no NSEC3
(but
> > > there maybe NSEC3PARAM). Shouldn't we do the same?
> >
> > We do in the loader. I am not following where in the original code
that you are referring to.
>
> I meant this by "original":
> {{{#!cpp
> const NSEC3Map& map = impl_->zone_data_->nsec3_data_->map_;
> if (map.empty()) {
> isc_throw(DataSourceError,
> "findNSEC3 attempt but zone has no NSEC3 RR: " <<
> impl_->origin_ << "/" << impl_->zone_class_);
> }
> }}}
> and, as far as I can see, loader doesn't ensure we can skip the same
> check. See relevant comments for the tests below.
Such a check was added after the last review (to see if the tree is
empty).
>
> Rest of my code comments follow::
>
> '''domaintree.h'''
>
> - I don't see the need for `getSubTreeRoot()`. Why was this added?
> (I've not reviewed corresponding tests as I was not sure if we
> really need this method).
The `getSubTreeRoot()` was a utility used by both `getUpperNode()` and
`getLargestInSubTree()`. I've made these functions private now.
> - Why did you change the type of node_count_ to size_t? I suspect
> sizeof(size_t) can often be 8 for 64-bit machines, and with the
> existence of `needsReturnEmptyNode_` it would make the
> `sizeof(DomainTree)` unnecessarily large. It can be a waste if we
> support millions of zones. I admit 'unsigned int' might not be a
> good choice either, though. Maybe the best one is uint32_t, and
> explain why we specifically choose it.
This was changed because `size_t` is the appropriate type for memory sizes
and memory-sized counters (like number of items of something, unless there
is known to be a smaller hard-limit). Before, there was a mix of `int` and
`unsigned int`. I've changed it to `uint32_t` now.
>
> '''memory_client.h/cc'''
>
> - maybe not really related to this branch, but I suspect we shouldn't
> make findZone2() a const member function, because it effectively
> returns a mutable internal state. Also, it would better to be not
> virtual (so it won't be misleading). I'd probably choose a
> different name. And, if it's mainly for tests I'd note that in the
> documentation (if it's expected to be used for general use, I think
> we should reconsider it - returning internal zone data seems to be a
> risky interface).
I've renamed it to `findZoneData()` and made it non-virtual, non-const,
and added a comment about its use in testing.
> - This now-removed comment makes me nervous
> {{{#!cpp
> - // This variant of findZone() is not implemented and should be
> - // removed eventually. It currently throws an exception. It is
> - // required right now to derive from DataSourceClient.
> - virtual isc::datasrc::DataSourceClient::FindResult
> - findZone(const isc::dns::Name& name) const;
> }}}
> It said "should be removed eventually", but this branch did the exact
> opposite. What happened?
That comment was added by me in #2108 as I thought we'd be going towards a
different interface for `findZone()`. But we kept the same interface, so
the comment was removed.
> '''treenode_rrset'''
>
> - I don't see why we need to implement getRRsig() in this branch. But
> if you wanted to implement it, please be responsible for providing
> tests. The documentation needed to be updated too. (But I did both
> in #2219, so you don't have to address these in this branch).
It was used in the tests inside `rrsetsCheck()`. The `TreeNodeRRset` tests
were missed, but this needed to work for the zone finder tests to pass.
I'll skip adding tests in this branch as #2219 has them and we'll get them
when these merge. Same for adding `getTTL()` too, but that was indeed
tested.
>
> '''zone_finder_unittest.cc'''
>
> - I'd avoid initializing nsec3_hash_map in the namespace level. I've
> committed my suggested change to address this (2fdacd8).
This whole thing is gone now, as `NSEC3Hash` is now used and
`faked_nsec3.cc` is mostly untouched except addition of some data.
> - should this be removed?
> {{{#!cpp
> // This needs to be updated and checked when implementing #2118
> }}}
Gone now.
> - this is invalid if salt_data_2 is empty:
> {{{#!cpp
> (std::memcmp(&salt_data_2[0], salt_data, salt_len) !=
0)) {
> }}}
These are tested for length now.
> - I don't understand this comment:
> {{{#!cpp
> // We assume that rrsig has already been checked to match rrset
> // by the caller.
> }}}
It was an error in pasting some `RdataSet::create()` code from elsewhere.
Removed.
> - I don't think this test really does what it claims:
> {{{#!cpp
> // Only having NSEC3PARAM isn't enough
> addZoneData(textToRRset("example.org. 300 IN NSEC3PARAM "
> "1 0 12 aabbccdd"));
> }}}
> because in the actual implementation, "only having NSEC3PARAM"
> enables `NSEC3Data`. So findNSEC3 actually wouldn't throw.
I have checked this. It indeed throws in the tree node count test at the
top of `findNSEC3()`. If that test is removed, it asserts at the bottom of
that method.
> - the comment doesn't really seem to be 100% accurate. It's indeed
> for `InMemoryZoneFinder`, but in this context it's NSEC3 specific.
> {{{#!cpp
> /// \brief Test fixture for the InMemoryZoneFinder class
> class InMemoryZoneFinderNSEC3Test : public InMemoryZoneFinderTest {
> }}}
Updated.
> - can you add some comments on findNSEC3Walk and its the data? What's
> the point of the test, meaning of `TestData` member variables, how
> the hash values are chosen (whether they are randomly chosen or
> intended to test some specific case - I suspect the latter, but it's
> not immediately obvious from the definitions).
Added.
>
> '''nsec3hash'''
>
> Frankly, I don't like this hack. It breaks some design principles of
> the class (e.g., reusing internal resources for multiple hash
> calculations - for that matter, this branch doesn't update the
> documentation on this). Also, defining `NSEC3Hash::calculate` for
> SHA1 only seems ad hoc.
>
> To me, a better approach is to extend the factory interface so that we
> can create an `NSEC3Hash` instance from the algorithm, iterations and
> salt.
>
> the in-memory implementation would have a function pointer to a
> specific factory instance for the normal and test cases, not the
> calculation function itself.
This hack was removed and replaced with another `NSEC3Hash::create()`
variant which is much cleaner. I wish I had thought of doing this before.
The in-memory implementation does not keep any function pointers now, as
the code in `NSEC3Hash` itself has a way to provide fake hash calculation.
>
> '''zone_finder.h'''
>
> nsec3_calculate_: In general, I don't like to introduce a public or
> protected member variable, especially if it's mutable, as it can
> easily break class integrity due to a careless application or derived
> class implementation. In this case, if there's no better way to allow
> tests to use a faked calculator, please at least leave comment that
> this protected member variable is only intended for test purposes
> (isn't it?), and other derived class shouldn't try to tweak it. Also,
> I suspect the in-memory finder class isn't really expected to be
> further derived except for testing. If that's the case, please also
> document that.
This code is gone now due to the use of `NSEC3Hash`.
>
> '''zone_finder.cc'''
>
> In addition to other points that were already discussed, I've noticed
> the same patter like this repeated:
> {{{#!cpp
> const RdataSet* rdataset = node->getData();
> assert(rdataset != NULL);
> assert(rdataset->type == RRType::NSEC3());
>
> ConstRRsetPtr closest = createTreeNodeRRset(node, rdataset,
> getClass());
> }}}
> and it makes the method longer and less readable. I'd introduce a
> helper function to unify the logic:
> {{{#!cpp
> ConstRRsetPtr
> createNSEC3RRset(const ZoneNode* node, const RRClass& rrclass) {
> const RdataSet* rdataset = node->getData();
> // add comment why we can assert these
> assert(rdataset != NULL);
> assert(rdataset->type == RRType::NSEC3());
>
> return (createTreeNodeRRset(node, rdataset, rrclass));
> }
> }}}
> and replace the above with this:
> {{{#!cpp
> ConstRRsetPtr closest = createNSEC3RRset(node, getClass());
> }}}
>
Done. :)
--
Ticket URL: <http://bind10.isc.org/ticket/2218#comment:14>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list