[bind10-dev] recursor cache requirements - input required
Shane Kerr
shane at isc.org
Fri Dec 10 15:12:07 UTC 2010
Stephen,
> On 8 Dec 2010, at 13:27, Shane Kerr wrote:
> > On Tue, 2010-12-07 at 16:55 +0800, Likun Zhang wrote:
> >> The key for one RRset in the cache should be "Domain_name + Type +
> >> Class". In the value part, besides the rdata of each RR in the RRset,
> >> there should be the RRset's signature (RRSIG record), if it has one,
> >> and the security status (DNSSEC validation result) of the RRset.
> >
> > It is very unlikely that the cache will ever be used for any class other
> > than IN. I suggest that we restrict the cache to a single class, in the
> > interest of saving 16-bits per entry plus associated processing time.
>
> If we make the cache a single C++ class, we could always create other
> instances of it for different DNS classes - although we want to be
> aware of a malicious attack whereby an attacker could force the
> creation of 64k instances of the cache (see below). Should we go this
> way, it will have an impact on the NSAS. At present the NSAS stores
> all classes in one store; if we explicitly take DNS class out of the
> cache, we should do the same for the NSAS.
Personally I really think we should just support class IN on the
recursive side. CH is useful for getting ID.SERVER and the like, but
nothing useful on the caching side, and the only other class is HS,
which MIT can send us a patch for. We can't get those 16 bits back from
the protocol, but we can get them back in our implementation!
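To make the saving concrete, here is a minimal sketch (illustrative names, not BIND 10's actual types) of what an IN-only cache key could look like: with the class fixed, the key carries only the owner name and type, and the 16-bit class field disappears from every entry.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical IN-only cache key: owner name + RR type, no class field.
// Names are assumed stored in canonical (lowercase) form so lookups are
// case-insensitive.
struct CacheKey {
    std::string name;   // canonical owner name, e.g. "example.com."
    uint16_t type;      // RR type code, e.g. 1 for A

    bool operator==(const CacheKey& o) const {
        return type == o.type && name == o.name;
    }
};

struct CacheKeyHash {
    size_t operator()(const CacheKey& k) const {
        return std::hash<std::string>()(k.name) ^ (size_t(k.type) << 1);
    }
};

// Value type elided; the RRSIG and security status would live there.
using RRsetCache = std::unordered_map<CacheKey, std::string, CacheKeyHash>;
```

A separate cache instance per class, as suggested above, would then be a map from class code to `RRsetCache`.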
> >> ==== dumping/loading to/from one document/database ====
> > Because the cache can be quite large, we need to define the behavior
> > when the cache is being dumped. I suggest that the cache should not
> > block add/remove operations when this is going on.
> >
> > Actually, it might be possible to act in two ways: allow add/remove
> > operations for when dumping or loading during runtime, and disallow for
> > faster, lock-free operation when starting up or shutting down.
>
> For dumping, could we not fork the process and have the child write
> the copy of the cache at our leisure? That would work on Unix-based
> systems at least.
Yeah, that would work quite well, as long as we ensure that we don't
have lots of dumps in progress. My main interest is ensuring that the
requirement that dumping not affect normal operation is
documented. :)
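As a sketch of that guard (hypothetical names, Unix-only): the child inherits a copy-on-write snapshot of the cache and writes it out at its leisure, while the parent refuses to start a new dump until the previous child has exited.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

static pid_t dump_pid = 0;   // 0 means no dump in progress

// Fork a child to write the cache snapshot; refuse if a dump is running.
bool startDump(const char* path) {
    if (dump_pid != 0 && waitpid(dump_pid, nullptr, WNOHANG) == 0) {
        return false;        // previous dump still in progress
    }
    dump_pid = fork();
    if (dump_pid == 0) {     // child: snapshot is ours, write at leisure
        std::FILE* f = std::fopen(path, "w");
        // ... serialize cache contents to f ...
        if (f) std::fclose(f);
        _exit(0);
    }
    return dump_pid > 0;     // parent: fork succeeded (or failed, -1)
}
```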
> As to loading, it depends how long loading will typically take. If a
> short time, we can construct the cache before we begin operations; if
> a long time, a separate component that works through the dump in the
> background and does an "add if not here and if the stored data has not
> yet expired" should work.
Well... I think *typically* loading will take a couple of seconds. But
for larger caches - where reloading the cache is most helpful - it could
take several minutes. On my laptop I just timed it taking about 40
seconds to cat a 700 MB ISO to /dev/null. Even half that long would feel
like AGES to a system administrator trying to get service back
online. :)
So I think doing the loading as you describe makes sense.
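A minimal sketch of that background loader (illustrative types, assuming expiry times are stored as absolute timestamps at dump time): work through the dump one record per step, inserting only entries that are absent from the live cache and not yet expired.

```cpp
#include <ctime>
#include <map>
#include <string>
#include <vector>

// Hypothetical dumped-entry format; not BIND 10's actual structures.
struct DumpedRRset {
    std::string key;      // name+type key
    std::string rdata;
    time_t expiry;        // absolute expiry computed at dump time
};

// One incremental step: "add if not here and if not yet expired".
void loadStep(std::map<std::string, std::string>& cache,
              const std::vector<DumpedRRset>& dump, size_t& cursor) {
    if (cursor >= dump.size()) return;          // load complete
    const DumpedRRset& r = dump[cursor++];
    if (r.expiry > time(nullptr) &&             // not yet expired
        cache.find(r.key) == cache.end()) {     // not already cached
        cache[r.key] = r.rdata;
    }
}
```

Called between normal operations, this never blocks the query path for more than one record's worth of work.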
> > We may also want a way to completely empty the cache. If nothing else
> > this can be useful in debugging. :)
>
> If the cache is a single object, how about deleting it and creating a
> new instance?
Okay, that makes sense, as long as there is a user-visible way to do this.
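A minimal sketch of that, assuming the cache sits behind a shared pointer (names are illustrative): "flush" is just swapping in a fresh instance, and the old contents are destroyed once no in-flight lookup still holds a reference.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Placeholder cache type; the real one is the class discussed above.
using Cache = std::unordered_map<std::string, std::string>;

std::shared_ptr<Cache> cache = std::make_shared<Cache>();

// User-visible "flush": replace the instance rather than clear in place.
void flushCache() {
    cache = std::make_shared<Cache>();   // old instance freed when unused
}
```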
> >> the value for one message should include message header, index information
> >> for each rrset in different message sections. For the structure, see the
> >> following sketch. The security status(dnssec validation result) of the
> >> message should also be noted.
> >
> > It might be nice to include versions of this data with name compression
> > too, right? To avoid having to perform this processing again.
>
> The compression technique involves pointers to absolute offsets in the
> message. If an RRset is found in more than one message, (potentially)
> we will have to have one compressed version of the RRset for each
> message it is in. It may well be simpler to cache the wire format of
> each message as-is and create associated index structures mapping it
> to the constituent RRsets (and vice-versa). Then if any of the RRsets
> is updated (other than a TTL update that we can explicitly write back
> into the message), drop the message and re-query.
I wasn't implying using compression at the RRset level, but pretty much
exactly as you describe. Store the compressed data of the message, and
treat TTL updates of the RRset as distinct from RDATA updates.
We may have to rebuild the message and then re-compress it, but I don't
think we have to re-query, since the message is something we are putting
together ourselves.
--
Shane