* Proposal for a BIND 10 shared cache resolver

** Design goals:
- simple but powerful design
- possible to assign a cache module to one or more resolvers in a NUMA architecture, so that the DNS cache data resides locally in the same CPU cache that the resolver uses
- possible to operate cache modules on dedicated machines in a BIND 10 cluster environment
- reuse proven existing (DNS) technologies and tools (known to DNS operators), reducing support and training investment
- flexible configuration (all local, mixed local and remote caches)

** Design summary:
- communication between the cache module and resolver modules uses the DNS protocol (over IP or other IPC)
- the cache module is implemented as a special-purpose DNS server

** Design detail

*** Cache module
- the cache module is a dedicated DNS server, similar to the "Auth" module, but instead of serving resource records from files it serves DNS records from a cache memory structure
- the cache module listens for DNS queries on a regular DNS INET socket (UDP/TCP port 53), but can also be configured to use fast IPC methods available on the OS platform (Unix domain sockets, named pipes, shared memory ...)
- the cache module does not implement any resolver function. It can only look up records in the cache. Successful lookups are returned to the requestor (NOERROR or NXDOMAIN); cache misses return RCODE 9 (NotAuth) (or another RCODE that makes sense to indicate a cache miss)
- the cache module can receive dynamic DNS updates. These updates modify the internal DNS cache memory structure. Dynamic DNS is how resolvers (and operators) maintain the cache data.

*** Resolver module
- a resolver module can have one or more cache modules configured.
  The cache module configuration is ordered ("local" cache modules are queried first, "remote" caches later)
- cache modules can be configured to be asynchronous, so that the resolver can dispatch queries to more than one cache module at a time (concurrent cache queries)
- the resolver "forwards" DNS queries to the cache modules in order or in parallel. If no cache module responds with a positive (NOERROR or NXDOMAIN) answer, the resolver starts iterative resolution. For performance reasons, it might be possible to configure a resolver module to do "speculative" recursion, that is, to start recursion at the same time that the caches are queried. The first positive answer received (either from cache or from recursion) is used, and all other outstanding queries (cache or recursion) are stopped
- the resolver does DNSSEC validation
- on successful iterative resolution, the resolver updates the cache using a dynamic DNS update
- if both the cache module and the resolver are on the same physical machine, they should use the fast IPC available on that OS platform

*** Cache maintenance
Assumption: DNS wildcards ("*") in resource records only appear in authoritative zones, never in cache data (as the wildcard is expanded by the authoritative server). Please correct me if a wildcard can be seen in a cache.
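
If the assumption holds, a leading "*" label never occurs as a real owner name in the cache, so it is free to act as a subtree selector for maintenance operations. A minimal sketch of such matching in Python follows. The function names and the "include_apex" switch are hypothetical, and whether a selector should also cover the apex name itself is an open assumption rather than part of this proposal.

```python
def normalize(name: str) -> str:
    """Lower-case a domain name and strip the trailing root dot."""
    return name.rstrip(".").lower()


def selects(selector: str, name: str, include_apex: bool = True) -> bool:
    """Return True if the cached owner name is picked by the selector.

    A selector starting with "*." picks every name below the given suffix
    (and, by assumption, the suffix itself when include_apex is True).
    Any other selector picks only the exact name.
    """
    selector, name = normalize(selector), normalize(name)
    if not selector.startswith("*."):
        return name == selector
    apex = selector[2:]
    if name == apex:
        return include_apex
    # Require a label boundary so "otherdomain.tld" is not matched.
    return name.endswith("." + apex)


cached_names = ["www.domain.tld", "mail.domain.tld.", "domain.tld", "otherdomain.tld"]
selected = [n for n in cached_names if selects("*.domain.tld", n)]
print(selected)
```

The label-boundary check matters: a plain suffix comparison would also select unrelated names such as "otherdomain.tld".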
- the cache content can be inspected with normal DNS queries (dig or similar tools)
- the cache content can be listed by a "wildcard zone transfer" (dig @server "*.domain.tld" AXFR)
- the cache content can be updated by dynamic updates (implements the equivalent of "rndc flushname": "update delete domain.tld")
- a DNS tree hierarchy in the cache can be removed by using a special "wildcard update": "update delete *.domain.tld" (implements the equivalent of "rndc flushtree")
- operators can "spoof" their cache by entering records into the cache with high TTLs
- security for dynamic updates over the network is provided by TSIG

** Benefits
- DNS operators and developers can use existing DNS lookup tools (dig) to examine the content of the cache and troubleshoot the operation of the cache module
- DNS operators and developers can use existing DDNS update tools (nsupdate) to maintain the cache
- existing sniffing tools (tcpdump, Ethereal, snoop) that understand the DNS protocol can be used to inspect the communication between resolver modules and caches

** Example for a NUMA architecture configuration:
machine with 32 cores, 8 groups of 4 cores that share the same local CPU cache
- 24 resolvers started (3 pinned to each CPU group)
- 8 cache modules started (1 pinned to each CPU group)
- resolvers are configured to query their local (on the same CPU group) cache module first, then a cache that is on a different CPU group
- if none of the local cache modules knows an answer, the resolvers could be configured to dispatch a query to an external cache system in the same datacenter, which might still be faster than recursive resolution

** Problems
- DNSSEC validation is done in the resolver module. The result of a DNSSEC validation should also be stored in the cache, to prevent costly "re-validation" of already validated data. The DNS protocol does not have a standard mechanism to update the state of DNSSEC validation over DDNS.
  It might be possible to augment the record update with a special, private record type that causes the cache to store the result of the DNSSEC validation along with the data. To signal successful DNSSEC validation from the cache to the resolver, the AD flag could be used.
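
The lookup order from the "Resolver module" section and the validation-state idea above can be sketched together in Python. This is an illustration under stated assumptions, not BIND 10 code: the class and function names are hypothetical, the cache is an in-process object instead of a DNS server reached over a socket, TTL handling is omitted, the "validated" argument stands in for the proposed private record type, and updating every configured cache after recursion is one possible policy.

```python
from dataclasses import dataclass, field

NOERROR, NXDOMAIN, NOTAUTH = 0, 3, 9  # DNS RCODE values


@dataclass
class Answer:
    rcode: int
    records: list = field(default_factory=list)
    ad: bool = False  # AD flag: set when the cached data was validated


class CacheModule:
    """Hypothetical stand-in for a cache module (no resolver function)."""

    def __init__(self):
        self._store = {}  # (name, rtype) -> (rcode, records, validated)

    def query(self, name, rtype):
        entry = self._store.get((name, rtype))
        if entry is None:
            return Answer(NOTAUTH)  # cache miss
        rcode, records, validated = entry
        return Answer(rcode, list(records), ad=validated)

    def update(self, name, rtype, rcode, records, validated):
        # "validated" models the assumed private record in the DDNS update.
        self._store[(name, rtype)] = (rcode, list(records), validated)


def resolve(name, rtype, caches, recurse):
    """Ask the caches in configured order, then fall back to recursion."""
    for cache in caches:
        answer = cache.query(name, rtype)
        if answer.rcode in (NOERROR, NXDOMAIN):
            return answer  # positive answer from a cache
    # No cache knew the answer: resolve iteratively and validate.
    rcode, records, validated = recurse(name, rtype)
    for cache in caches:  # assumed policy: update all configured caches
        cache.update(name, rtype, rcode, records, validated)
    return Answer(rcode, records, ad=validated)


def fake_recurse(name, rtype):
    # Hypothetical stand-in for iterative resolution plus DNSSEC validation.
    fake_recurse.calls += 1
    return NOERROR, ["192.0.2.1"], True


fake_recurse.calls = 0
local, remote = CacheModule(), CacheModule()
first = resolve("www.example.com", "A", [local, remote], fake_recurse)
second = resolve("www.example.com", "A", [local, remote], fake_recurse)
print(first.records, second.ad, fake_recurse.calls)
```

In this sketch the second call is answered by the local cache with the AD flag set, so recursion and re-validation are not repeated. Speculative recursion and concurrent cache queries are left out to keep the example short.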