[bind10-dev] recent performance improvements

Jeremy C. Reed jreed at isc.org
Tue Mar 20 13:13:14 UTC 2012


On Mon, 19 Mar 2012, JINMEI Tatuya / 神明達哉 wrote:

> To be clear: if you mean "disabling threads" by non-optimum, comparing
> to "without specifying -n", then I would say it's different from
> "non-optimum".  It's just unfair (or misleading) comparison:
> 
> - disabling threads: uses exactly one core
> - enabling threads, not specifying -n: uses as many cores as available
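The two modes above come from how named is built and invoked; a minimal sketch of the configurations being compared (configure flags and the -n option as in BIND 9; paths and the -g foreground flag are just for illustration):

```shell
# 1. Built without threads -- always exactly one worker:
./configure --disable-threads && make
named -g

# 2. Built with threads, -n omitted -- one worker thread per detected core:
./configure --enable-threads && make
named -g

# 3. Built with threads but pinned to one worker thread -- the fair
#    comparison against a single b10-auth process:
named -g -n 1
```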


Okay. I am now specifically setting it to -n 1 by default. The updated 
performance results (BIND 9 without threads is slower than BIND 9 built 
with threads and -n 1):

builtin--.nxdomain 3202.635569 qps 3% slower
builtin--.soa 3408.178849 qps 5% slower
builtin--.success 15266.703820 qps 6% slower
root-memory-.nxdomain 38362.727878 qps 38% slower
root-memory-.soa 25547.769731 qps 56% slower
root-memory-.success 16004.849085 qps 68% slower
smallzone-memory-.nxdomain 40865.231371 qps 34% slower
smallzone-memory-.soa 35941.285741 qps 42% slower
smallzone-memory-.success 37586.834986 qps 39% slower
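The "% slower" figures can be reproduced from raw qps pairs. A minimal sketch; the baseline number below is hypothetical, not one of the actual threaded -n 1 measurements:

```python
def percent_slower(baseline_qps: float, measured_qps: float) -> float:
    """Percentage by which measured_qps falls short of baseline_qps."""
    return (baseline_qps - measured_qps) / baseline_qps * 100.0

# Hypothetical example: against a threaded -n 1 baseline of 62000 qps,
# a no-threads run at 38440 qps is about 38% slower.
print(round(percent_slower(62000.0, 38440.0)))  # 38
```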

(By the way, this differs from what I found a couple of years ago, when 
the possible overhead of a single thread caused worse performance than 
no threads at all.)

> In the context of comparing BIND 9 and BIND 10, it just doesn't make
> sense to compare "enabling threads without -n" BIND 9 with a
> single-process b10-auth if the machine has multiple cores.

Single b10-auth versus named built with threads using -n 1:

builtin--.success 25657.951995 qps 68% faster
root-memory-.success 19102.104800 qps 19% faster
builtin--.nxdomain 3599.940615 qps 12% faster

root-memory-.soa 19025.823065 qps 26% slower
smallzone-memory-.nxdomain 30719.644537 qps 25% slower
root-memory-.nxdomain 29075.203829 qps 24% slower
smallzone-memory-.soa 28879.885552 qps 20% slower
smallzone-memory-.success 30725.390836 qps 18% slower
builtin--.soa 3166.610208 qps 7% slower


I need to redo my research comparing BIND 9 with no -n and with -n 1 
through -n 32. Last time I did that, on the same system with an older 
BIND 9, named's autodetected CPU count gave 84% of the speed of the best 
-n number. Since you reminded me about -n 1 again, that may explain why 
-n 4 was slower than no -n in some cases. So maybe a newer BIND 9 
improved this autodetection, or it simply chose a better number of 
worker threads. Then I will compare the best -n against the best number 
of b10-auth processes.
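For that -n sweep, a small harness needs to pull the qps figure out of each queryperf run. A sketch, assuming queryperf's summary line has roughly the form shown in the sample below (the exact wording may differ across queryperf versions, so the pattern is an assumption to adjust):

```python
import re

# Assumed shape of queryperf's summary line; adjust the pattern if your
# queryperf version formats it differently.
QPS_RE = re.compile(r"Queries per second:\s+([\d.]+)\s+qps")

def parse_qps(output: str) -> float:
    """Extract the qps figure from queryperf's text output."""
    match = QPS_RE.search(output)
    if match is None:
        raise ValueError("no qps line found in queryperf output")
    return float(match.group(1))

# Sample output fragment (hypothetical number):
sample = "  Queries per second:   30719.644537 qps\n"
print(parse_qps(sample))  # 30719.644537
```

The sweep itself would then restart named with each -n value, run queryperf against it, and keep the best qps for the final comparison against the best b10-auth count.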

