Recommended setup with large cache memory

Brad Knowles brad at stop.mail-abuse.org
Fri Sep 9 14:15:58 UTC 2005


At 2:26 PM +0200 2005-09-09, Attila Nagy wrote:

>  Do you have benchmarks between Linux and FreeBSD, with and without
>  threading? As you say, the above is old, both parties have evolved since.

	No, I don't generally have much to do with Linux.  There are some 
machines I help administer that run Linux, but I'm just one of many 
people, and my role there is very limited.  I don't run Linux inside 
the house, and even if I were to try to install it on my machines, I 
have zero confidence that I'd be able to create a configuration that 
would be able to perform as well as it should.

>>>  See http://www.danga.com/memcached/ for details.
>>
>>  	I am familiar with memcached.
>
>  What is your opinion on using that to store the cached data?

	I see no advantage to using it over the in-memory database that 
BIND already uses, which is optimized for maximum speed and security. 
The security part matters, because you have to make sure to keep 
Chinese Walls between the authoritative data and the cached data.

	Memcached would certainly be a huge improvement if BIND had to go 
to files on disk for each and every query, but that's not the way 
BIND works.

>  Should I care? It's a cache. If the needed records are not available, it
>  goes out to the network and does a query.
>  Is this a SPOF?

	Well, to the degree that you care about any of the information on 
any of those systems, yes.

>  Sorry, but I don't understand this. Do people get big, expensive drive
>  arrays for squid caches?

	What do you think a NetApp NetCache is?  It's a big expensive 
RAID array with a custom-written caching web proxy that sits in front 
of it.  The difference is that the NetApp guys are smart, and their 
Write Anywhere File Layout (WAFL) is optimized for use in a RAID-4 
environment, and they've optimized everything else so that the parity 
disk doesn't melt and isn't a bottleneck.

	Otherwise, you'd have to run squid on a pretty huge honking 
machine with some damn expensive EMC or Hitachi storage area network 
devices.

>  Is an entry that can be retrieved from the network at any time valuable
>  enough to need extra protection beyond its integrity?

	If you care anything about latency, yes.  Otherwise, you might as 
well not run any kind of cache at all, and just always get all your 
data across the network.
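
	To put rough numbers on that, here is a minimal sketch of the 
average latency a client sees as a function of the cache hit ratio. 
The function name and the example figures (1 ms for a cache hit, 
100 ms for a lookup across the network) are assumptions made up for 
illustration, not measurements.

```python
def expected_latency_ms(hit_ratio, hit_ms, miss_ms):
    """Average per-query latency for a cache with the given hit ratio.

    hit_ratio is the fraction of queries answered from the cache (0.0-1.0);
    hit_ms and miss_ms are hypothetical latencies for a hit and a miss.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0.0 and 1.0")
    # Weighted average of the two cases.
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


if __name__ == "__main__":
    # Hypothetical figures: 1 ms from cache, 100 ms across the network.
    for ratio in (0.0, 0.5, 0.9, 0.99):
        print(ratio, round(expected_latency_ms(ratio, 1.0, 100.0), 2))
```

	With those made-up figures, a 90% hit ratio averages roughly 
11 ms per query, against 100 ms with no cache at all.  Lose the cache 
contents and you are back at the top row until it refills.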

>  if you have a virtual IP address with a load balancer, which routes the
>  queries to a number of caches you will have inconsistency in the answers.
>  For example it will be possible that the first query for mx.domain.com
>  will be negative (because in one of the caches there is an entry for it)
>  and the next one will give an IP address.

	This depends on the load-balancing switch that you're using. 
Some switches can be configured for affinity, so that a given query 
coming in for a given target will always go to the same back-end 
server, or set of back-end servers, unless they are down or have been 
taken out of the rotation.  In that case, the query will be directed 
to one of the other back-end servers, and that machine will then 
have the affinity for queries for that target.
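
	The affinity behaviour described above can be sketched in a few 
lines.  This is only an illustration, assuming rendezvous hashing on 
the query name; it is not how any particular switch implements it, 
and the function and server names are hypothetical.

```python
import hashlib


def pick_backend(qname, backends, down=frozenset()):
    """Pick a back-end cache for qname with name-based affinity.

    Rendezvous (highest-random-weight) hashing: every back-end gets a
    score for this name, and the highest-scoring live back-end wins.
    The same name therefore always maps to the same server while that
    server is in the rotation.
    """
    key = qname.lower().rstrip(".")  # DNS names are case-insensitive
    live = [b for b in backends if b not in down]
    if not live:
        raise RuntimeError("no back-end servers available")

    def score(backend):
        return hashlib.sha256(f"{backend}|{key}".encode()).digest()

    return max(live, key=score)


if __name__ == "__main__":
    caches = ["cache1", "cache2", "cache3"]  # hypothetical server names
    primary = pick_backend("mx.domain.com", caches)
    print("normal:", primary)
    print("primary down:", pick_backend("mx.domain.com", caches, down={primary}))
```

	One difference from the switch behaviour: in this sketch the 
name moves back to its original server as soon as that server returns 
to the rotation, whereas a switch may leave the affinity with the 
replacement.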

>  With a simple queryperf "benchmark" I could do about 35k qps on a UP
>  machine, if I query the same (cached) A record. This performance doesn't
>  really change with the cache size in use.

	You can get those kinds of numbers if you test with a copy of 
much larger zones, too.  I used .tv in my testing, specifically 
because it was guaranteed to be larger than the physical memory of 
the server I was working with.  But higher-end boxes are going to be 
able to keep all of .com or .tv in memory, at which point those kinds 
of query rates could continue to be expected.
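
	For a test that exercises more than a single cached record, the 
input needs to spread queries over many names.  Here is a minimal 
sketch that builds such an input, assuming the usual queryperf data 
file layout of one "name type" pair per line.  The function name and 
the sample names are hypothetical; a real test would draw the names 
from the zone being tested.

```python
import random


def queryperf_lines(names, count, qtype="A", seed=0):
    """Return `count` input lines of the form "name type".

    Names are drawn at random (with replacement) from `names`, so a
    list with one entry reproduces the single-record test and a large
    list spreads the load across the zone.  The seed makes runs
    repeatable.
    """
    if not names:
        raise ValueError("need at least one name")
    rng = random.Random(seed)
    return [f"{rng.choice(names)} {qtype}" for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    sample = [f"host{i}.example.tv" for i in range(1000)]
    print("\n".join(queryperf_lines(sample, 5)))
```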

>  BTW, a commercial product could handle the same (production) load with
>  about 3% of CPU usage, while bind still ate about 30-40% on that machine
>  (after upgrading to 9.3.1).

	Yeah, products like Nominum's ANS and CNS are designed for much 
higher performance.  They are next-generation programs, way beyond 
the capabilities that you could hope to have in any open-source 
product.  And you pay money to get that kind of speed.

>  I've already seen your paper. I think it would be interesting to repeat
>  that experiment.

	I'm in the process of doing that, but I'm not going to try 
testing different OSes, because I know that I'm not qualified to do 
that.

-- 
Brad Knowles, <brad at stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

   SAGE member since 1995.  See <http://www.sage.org/> for more info.


