[bind10-dev] Clustering

Thu Jun 18 09:27:09 UTC 2009

Jinmei,

[ Moving explanation of what I think we mean when we discuss 
  "clustering" in BIND 10 to public mailing list ]

On Tue, 2009-06-16 at 20:19 -0700, JINMEI Tatuya / 神明達哉 wrote:
> What about
> - clustering (to me it's still a buzzword...)

Hm... good point.

Many TLDs run a stack of computers at a number of sites, and then
anycast some IP addresses in their NS-set, spread across these sites.
Looking just at the authoritative side, what one has to do now is copy a
slightly different named.conf for each computer at a site (perhaps the
only difference is the SERVER.ID string, but for sites with
heterogeneous hardware you might not want to serve the same data from
each computer). Then a load balancer has to check the SOA serial of each
zone on every computer to make sure that they are within a certain count
of each other (so we don't serve stale data). A separate set of scripts
controls advertising
and withdrawing the site from BGP.
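
To make the SOA check above concrete, here is a rough sketch in Python. It
assumes the load balancer has already collected the SOA serial of one zone
from every computer at the site. The names serial_diff, stale_servers and
max_lag are made up for illustration and are not part of BIND or of any
load balancer; the wrap-around comparison follows the serial number
arithmetic of RFC 1982.

```python
MOD = 2 ** 32    # SOA serials are unsigned 32-bit values
HALF = 2 ** 31


def serial_diff(a, b):
    """Signed distance a - b using RFC 1982 style wrap-around arithmetic.

    A distance of exactly 2**31 is undefined in RFC 1982; this sketch
    arbitrarily treats it as negative.
    """
    d = (a - b) % MOD
    return d - MOD if d >= HALF else d


def stale_servers(serials, max_lag):
    """Return the hosts whose serial lags the newest one by more than max_lag.

    serials maps a (hypothetical) host name to the SOA serial it serves.
    """
    if not serials:
        return []
    # Pick any serial as a reference point, then find the newest relative to it.
    ref = next(iter(serials.values()))
    newest = max(serials.values(), key=lambda s: serial_diff(s, ref))
    return sorted(h for h, s in serials.items()
                  if serial_diff(newest, s) > max_lag)


if __name__ == "__main__":
    site = {"ns1": 2009061801, "ns2": 2009061800, "ns3": 2009061790}
    print(stale_servers(site, max_lag=2))
```

A computer that shows up in the returned list would be taken out of the
pool until it catches up; if every computer at a site is stale, the BGP
scripts would presumably withdraw the site.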

On the recursive side, clustering may involve anycasting an IP address
within a LAN fabric, using some sort of high-availability (HA) system
to add and withdraw recursive resolvers. In this environment there is no
cache sharing, and the setup again requires co-operation with the HA
system.

This is all quite ad-hoc and cumbersome to set up, as you can imagine.
Being able to automate this into a single configuration per site (or for
the entire Internet) is what I mean by "clustering".

Once that infrastructure is in place, then there may be sexier things
one can do as well. For instance, in-memory databases are fast but can
consume more memory than is cheaply available - especially when you sign
your zones (since these can be 3 or 4 times bigger than unsigned zones).
An intelligent authoritative cluster can load ownernames starting with
0-H on half the machines and I-Z on the other half, and direct queries
appropriately, reducing the total memory footprint per box by 50%. In a
recursive cluster, you may split the functionality of keeping the cache
from the actual resolution process (which can consume a lot of state).
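
As a sketch of the ownername split described above (0-H on one half of the
machines, I-Z on the other), something like the following Python could
decide which group answers a query. The function names and the machine
names are invented for the example, and the handling of names that start
with neither a digit nor a letter is my own assumption: they simply fall on
whichever side of 'H' their first character sorts.

```python
def shard_for(ownername):
    """Return 0 for ownernames starting with 0-H, 1 for those starting with I-Z.

    DNS names are case-insensitive, so the first character is upper-cased
    before comparing. Characters outside 0-9/A-Z are assigned purely by
    their ASCII ordering relative to 'H' (an assumption of this sketch).
    """
    first = ownername.strip().upper()[:1]
    return 0 if first <= "H" else 1


def route(ownername, groups):
    """Pick the group of machines that holds the data for ownername.

    groups is a two-element list of (hypothetical) machine-name lists.
    """
    return groups[shard_for(ownername)]


if __name__ == "__main__":
    groups = [["auth1", "auth2"], ["auth3", "auth4"]]
    for name in ("example.org.", "www.example.org.", "0.ip6.arpa."):
        print(name, "->", route(name, groups))
```

Each machine then only has to load its own half of the names, and whatever
sits in front of the cluster only needs this one cheap comparison per query.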

Note that we almost certainly will not have clustering support for our
year 1 deliverable, but as you noted we probably want to consider this
stuff in design phases so we don't have to refactor to support it later.

Cheers,

--
Shane