Recommended setup with large cache memory

Danny Mayer mayer at gis.net
Mon Sep 12 12:47:48 UTC 2005


Brad Knowles wrote:
> At 1:17 PM +0200 2005-09-09, Attila Nagy wrote:
> 
> 	Query comes in.  Look up the answer in the cache.  Answer is 
> there, but expired.  The server will go ahead and re-query for that 
> information, give you the answer that comes back, and store that in 
> the cache.  This is the way it works today, but there is no memory 
> recovered.
> 

The cache cleaning code was quite complicated, and the need to get it 
working with threading made it even more difficult to implement correctly.

> 
>> - if the memory is full, simply drop, or overwrite the "older" entries.
>> For example if you have 1000 "slots" if the number of entries reach 1000,
>> the first will be the next available slot.
> 
> 
> 	How do you know which ones are oldest?  You'd have to search the 
> entire in-memory database to find that out, which is the largest part 
> of what the standard cache cleaning process is already doing.
> 
> 
>> - somehow make the expire process adaptive or "rate limited". It seems
>> a little bit odd, to have very degraded performance while the expire runs.

It's much more straightforward to run the cache cleaning in a separate 
thread. The issue there is that you have to deal with locking. Figuring 
out what to lock is one of the biggest difficulties, and figuring out 
what not to lock is even harder. Locking is expensive, but not locking 
can be disastrous.
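
To make the trade-off concrete, here is a minimal sketch of a cleaning 
pass that a separate cleaner thread could call. The cache layout and 
names are hypothetical, not BIND's internals; the point is that the 
whole scan runs under one mutex, so query threads stall while it runs:

```c
#include <pthread.h>
#include <time.h>

/* Hypothetical cache layout -- illustrative only, not BIND's. */
#define CACHE_SLOTS 1000

struct cache {
    pthread_mutex_t lock;          /* protects used and expires[] */
    int used;                      /* number of live entries */
    time_t expires[CACHE_SLOTS];   /* expiry time of each entry */
};

/* One cleaning pass: drop every entry that expired before `now`.
 * Runs entirely under the lock, so the longer the scan, the longer
 * other threads wait.  Returns how many entries were removed. */
static int clean_expired(struct cache *c, time_t now)
{
    int removed = 0;
    pthread_mutex_lock(&c->lock);
    for (int i = 0; i < c->used; ) {
        if (c->expires[i] <= now) {
            c->expires[i] = c->expires[--c->used];  /* swap in last */
            removed++;             /* re-check the swapped-in entry */
        } else {
            i++;
        }
    }
    pthread_mutex_unlock(&c->lock);
    return removed;
}
```

A cleaner thread would call this periodically; the cost of the pass 
grows with the cache, which is exactly the contention problem discussed 
below.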
> 
> 
> 	Garbage collection in any application is almost always an 
> expensive process, unless you can design the system so that a little 
> bit is pretty much constantly being done, so that you never have a 
> large build up that you have to then spend a lot of time to go 
> through and clean out.
> 

The larger the cache is, the longer cleaning takes. Unfortunately that's 
not an easy problem: the busier a server is, the faster the cache builds 
up, and therefore the longer it takes to clean out old entries.
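
The "a little bit constantly" approach Brad describes can be sketched as 
an incremental cleaner: each call examines only a small batch of entries 
and remembers where it stopped, so the pause per pass is bounded no 
matter how large the cache has grown. (Hypothetical names and layout; 
BIND's actual cleaner is organized differently.)

```c
#include <time.h>

#define CACHE_SLOTS 1000
#define BATCH 32                 /* entries examined per pass */

struct cache {
    int used;                    /* number of live entries */
    time_t expires[CACHE_SLOTS]; /* expiry time of each entry */
    int cursor;                  /* where the next pass resumes */
};

/* Examine at most BATCH entries starting at c->cursor, dropping
 * expired ones.  Work per call is fixed, so there is never one
 * huge cleanup pause -- at the cost of expired entries lingering
 * a little longer.  Returns how many entries were removed. */
static int clean_batch(struct cache *c, time_t now)
{
    int removed = 0;
    for (int n = 0; n < BATCH && c->used > 0; n++) {
        if (c->cursor >= c->used)
            c->cursor = 0;       /* wrap around to the start */
        if (c->expires[c->cursor] <= now) {
            c->expires[c->cursor] = c->expires[--c->used];
            removed++;           /* re-check the swapped-in entry */
        } else {
            c->cursor++;
        }
    }
    return removed;
}
```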

> 
>> I will try threading enabled to see whether it improves this 
>>situation or not.
> 
> 
> 	If you're running on FreeBSD, Jinmei's report has demonstrated 
> that turning on threading is not likely to help you.
> 
> 
The problem that Jinmei demonstrates is that the cost of locking is a 
big factor in determining how useful threads are to you. Locking is 
expensive, but how expensive depends on the O/S. However, in order to 
use threads you MUST use locks, so it's a balancing act. You want to 
minimize the number of locks and locking/unlocking cycles, but you also 
need to make what is being locked small enough that another thread 
doesn't have to wait too long for a lock to be released, or it will 
adversely impact performance. Windows threads and locks happen to be 
very good, but then Windows requires them and threads are used 
everywhere. BIND on Windows ONLY comes threaded because I/O Completion 
Ports require it, and some other parts of the implementation can only 
work with threads.
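
One common way to keep the locked region small, as a sketch (the struct 
and names here are made up for illustration): copy what you need while 
holding the lock, release it, and do the slow work on the copy.

```c
#include <pthread.h>
#include <string.h>

#define MAX_NAME 64

struct shared {
    pthread_mutex_t lock;
    char name[MAX_NAME];     /* protected by lock */
};

/* Keep the critical section to a cheap memcpy; any expensive
 * processing happens after the lock is released, so other threads
 * are not stalled waiting for us. */
static size_t process_name(struct shared *s)
{
    char copy[MAX_NAME];

    pthread_mutex_lock(&s->lock);
    memcpy(copy, s->name, sizeof copy);   /* cheap work under lock */
    pthread_mutex_unlock(&s->lock);

    /* stand-in for expensive work, done outside the lock */
    return strlen(copy);
}
```

The trade-off is the extra copy; whether that beats holding the lock 
longer depends on how expensive the processing is and how contended the 
lock is.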

So programmers face two problems when implementing threads:
1. What to lock, and for how long? (You want to make the latter as 
short as possible.)
2. Debugging. Debugging threads and locks is very difficult because it 
can be hard to tell which thread is running at any moment and what the 
other threads are doing at the same time.
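
One small mitigation for the debugging problem is to tag every log 
message with the thread that produced it, so interleaved output can be 
attributed afterwards. A hypothetical helper:

```c
#include <stdio.h>
#include <string.h>

/* Format a log message prefixed with an application-assigned
 * thread number.  Writing into a caller buffer keeps the helper
 * easy to test; in a real server you would hand the result to
 * the logging subsystem. */
static int tlog(char *buf, size_t len, int thread_no, const char *msg)
{
    return snprintf(buf, len, "[thread %d] %s", thread_no, msg);
}
```

Logging "acquiring lock X" / "released lock X" lines tagged this way is 
often the quickest route to finding a deadlock or a lock held too long.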

There are advantages to using threads in BIND beyond being able to take 
advantage of multiprocessors: it can clean the cache, reload a zone, or 
transfer a zone while continuing to respond to queries at the same time. 
It can't be that responsive without threads, even if overall performance 
is better unthreaded.

Danny


