BIND 10 #192: Data source hotspot cache
BIND 10 Development
do-not-reply at isc.org
Wed Jun 16 04:15:55 UTC 2010
#192: Data source hotspot cache
-------------------------+--------------------------------------------------
Reporter: each | Owner: jinmei
Type: enhancement | Status: reviewing
Priority: major | Milestone: 05. 3rd Incremental Release: Serious Secondary
Component: b10-auth | Resolution:
Keywords: | Sensitive: 0
-------------------------+--------------------------------------------------
Comment(by jinmei):
Review is ongoing, but I'll make my comments cache.{cc,h} with a couple of
general comments first.
general:
- I've made some trivial (I think) cleanups directly on the branch.
- documentation is much better than other datasrc files (btw we need it
for them also), but IMO it's still not sufficient. See specific points
below. Also, I suggest you actually convert the doxygen markups into HTML
files and browse the results.
cache.h
- CacheEntry class
- why is this defined in the header file? This seems to be an
internal type only used within the CacheNode class. If so, this should be
hidden from others (move it to .cc). Plus, although this largely a matter
of taste, this looks more like a "struct" rather than a "class". If this
must be defined in a public header file, I strongly recommend hiding
member variables as private. This should be pretty obvious, but to be
redundant: as we can see in my experimental rbt patch, hiding the details
will help us when we want to customize it further, without worrying about
breaking compatibility.
- documentation: commenting about the actual implementation would also
be useful, but I'd first like to see a conceptual description of "what
this is".
- Cachenode class
- I normally expect more general documentation about a class of this
size (overall design, any non trivial design decision, etc). I'd also
expect exception considerations for each public member function.
- isValid()/getRRset/getFlags() aren't explicitly tested. they should.
- I'm not sure if copying/assignment is expected to be allowed for
CacheNode objects, but if not, they should be explicitly prohibited. (it
seems to be CacheNodes are generally passed as (shared) pointers, and can
be non-copyable)
- especially if this can be non copyable, I suggest you consider using
pimpl for this class to reduce dependency on the DNS related class
implementations in the header file. I know it's a common controversial
argument of ours. And I also understand this may be a case where the
benefit of pimpl may not outweigh its overhead because this class is for
performance optimization. But since this is cached information and the
overhead of creation time should be relatively minor, I still personally
think we should use pimpl here. See also the related comment on HotCache.
Note also that you've already partially adopted pimpl for this class via
the entry_ member variable.
- We should really, really, seriously reconsider the current design of
making some of the class member variables public (from prev to qtype).
This is something we should always avoid unless there's a very strong
reason to do so, and I don't see it in this case. Actually, this is
mostly private information only used by HotCache, so it's much better to
encapsulate such implementation specific information in CacheEntryPtr,
define it within .cc, thereby hiding the existence of such notion from
others in the first place.
- QTuple class
- AFAICS, this is only used within the cache implementation. it should
be hidden in the implementation file. I noted the TODO, but unless/until
we really do this it's better to hide it.
- On the other hand, if we really need a public class like QTuple,
dns::Question is almost exactly what's needed here. Why not reusing it?
(we could add operator< if that's the problem).
- Constructor: why passing object copies (which is expensive), rather
than references?
- document operator<. why we need this etc. note also that
Name::operator< is quite expensive (it calls the generic compare()
method).
- (maybe infeasible if we accept the first comment but) we need tests
for this class.
- is this (intended to be) copyable? If not, make it non copyable
explicitly.
- HotCache class
- We tend to abbreviate things, but isn't it better to name it more
clearly, e.g., HotSpotCache?
- Class description: describing internal implementation (using
std::map/double linked list, etc) is good, but I'd separate the
documentation for the high level interface from that for the actual
implementation. For example, entries are managed using a LRU-based
algorithm is high level, while it's implemented via a doubly linked list
is an implementation detail. I personally separate the latter into a
different paragraph, beginning with "note to developers".
- Class description: like Cachenode, I'd add more design decisions, non
trivial points, etc. That would include why the default lifetime is 30
seconds, cache and negative cache, and the rationale of using std::map,
what are positive and negative caches, etc.
- As for design decision, I guess we need to consider multi-thread
considerations. Although we are currently not using threads, and I'm not
sure if we'll do in our auth server in future versions, our implementation
so far is (in my understanding) generally thread safe, except, perhaps
within sqlite3/ASIO library implementations. So the access to this cache
is going to the first place that is not thread safe. We should at least
note that, and describe our future plans on that point. Note also that
the data source library is public, and some external developers may want
to use it with threads.
- Class description: I'm not sure if it's a good idea to mention
DataSrc::doQueryTask() here, because the cache may be used in other
contexts. I'd rather make the description generic, while noting
doQueryTask() uses the HotCache in its own description.
- Design: I don't think making this as a static is a good idea. One
practical reason for this is that we may actually want to have multiple
different caches if/when we support views. And, of course, use of
static/global objects should be avoided at all costs. In this specific
case, I don't see an inevitable reason why we must define it as a non
local static object. The cache isn't needed unless query processing
happens, so it should be reasonable to assume the application can prepare
it by the time it needs the cache. I strongly suggest you reconsider this
design.
- A related point: I saw comments about the initialization of
Query::cache(0). Trying to avoid the static initialization order fiasco
just by noting it in a comment isn't a good practice. There are many non-
trivial pitfalls around this beast, so non-trivial that we can't avoid
them just by hoping the developers read the comment and do the right
thing. If we really, really, really need it as a static object, we must
avoid the fiasco more explicitly using language features. But again,
please reconsider the design in the first place before trying to make it
work with complicated tricks.
- Design: now the controversial point:-) I strongly suggest pimpl be
used for this class. This class is a perfect example of the case where
the advantages of pimpl outweigh its overhead: having an external
definition (std::map) inside the implementation, and the overhead of
constructing/destructing the object is negligible. I now hope we've all
learned how advantageous it is if we minimize dependencies on other
implementations in our public header files via the turmoil with the non
boost ASIO, and I hope my argument is now more convincing.
- I believe the HotCache class is intended to be non-copyable. Please
explicitly make it so.
- description for member functions in general: same comments as that for
Cachenode.
- naming: if we follow our method name convention, cache() is better
named setCacheEntry(), s/ncache/setNegativeCacheEntry()/,
s/retrieve()/getCache, etc? Maybe a matter of taste, just a comment.
- description about cache(): it should describe what happens if there's
already an entry that matches the key. Also describe what if it's a
negative cache. These case should also be tested.
- description about ncache(). Likewise.
- methods like retrieve() should normally be definable as a const member
function. Hmm, okay the implementation explains why this can't be a
const. The method description should explain that.
- retrieve() signature: why passing object copies?
- setSlots description: it states "up to the number of RRsets in the
data source". I don't understand this. Which data source is it?
Whatever it means, since there can be negative cache entries, this
description still doesn't make sense to me.
- constant of 30: I'd define it as a static const class member with
doxygen description, and use it other places consistently, so that the
magic number appears only once (and we make sure it's described).
- count_ might be redundant because it should be identical to
map_.size().
- LRU list: why using in-house list, rather than std::list? (see also
the comment about the implementation).
cache.cc
- CacheNode::~CacheNode(): if you use the default destructor you can
omit the declaration and the definition.
- HotCache::~HotCache(): I don't understand what the for loop does.
Does this do something correct? Also recall my other comment about
why using in-house list. If we use std::list, we don't even have
to bother to clean up these things.
- HotCache::retrieve(): map::operator[] creates a new entry if it
doesn't have one for the key. It's probably not necessarily
incorrect, but I suspect it's error-prone behavior. For example,
consider the case where the caller continuously calls retrieve()
for different non-existent keys. I suggest you use map::find()
instead.
- HotCache::insert(): why do we need to have a while loop?
if count_ > slots_ > 0, count_ must be slots_ + 1, and we should
only need a single remove operation. if this doesn't hold it's an
internal bug, and I'd rather let it fail with assert (which I
prefer) or exception than preparing for the "impossible case" with
a complicated logic.
- HotCache::ncache(): the condition of ANY()s isn't documented (and
not intuitive).
--
Ticket URL: <http://bind10.isc.org/ticket/192#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list