Whatever gives you the best performance is what you should choose to
do.

BIND 9 does not currently benefit as much from more cores as it does
from more powerful cores.  I suspect this is what is happening in your
tests.  The 4-core machine can be fully utilized, and each core is
faster.  On the 8-core machine, you are not only doubling the amount of
data that has to be moved around in some cases, you are also lowering
the core speed.  If BIND uses perhaps 6 of the cores, with the typical
64-bit overhead of more data and more effort on the system, you are
losing

Also, as several people here have mentioned, you are not using identical
machines.  When you benchmark and compare results, you want to minimize
the differences or at least account for them.  Your machine is not just
made of CPUs; it also has memory (which may vary in speed), network
cards and their drivers (which may be better or worse depending on the
card and driver), and a motherboard chipset.

There are simply too many variables in your comparison for us to
account for, so we cannot really help with it.  Thus, the only question
that is ultimately important to you is: on this new hardware, how can I
make BIND 9 run fastest?  If that turns out to be a 32-bit OS or a
32-bit named binary, that's your answer.
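
If it helps, here is a rough sketch of how you might compare the
configurations once you have measured them all on the same box.  It
assumes you have already collected several queries-per-second readings
per configuration with whatever load tool you use.  The function name,
the configuration labels and every number below are made up for
illustration; they are not measurements from anyone's hardware.

```python
from statistics import median


def fastest_config(runs):
    """Return (name, median_qps) for the configuration with the highest
    median queries-per-second.

    `runs` maps a configuration label to a non-empty list of qps
    readings taken on the SAME hardware, so that only the OS/binary
    choice differs between entries.
    """
    # The median is less sensitive than the mean to one unusually
    # good or bad run.
    medians = {name: median(samples) for name, samples in runs.items()}
    best = max(medians, key=medians.get)
    return best, medians[best]


# Hypothetical readings, NOT real benchmark results.
example_runs = {
    "64-bit OS, 64-bit named": [41000, 43500, 42200],
    "64-bit OS, 32-bit named": [45100, 44800, 46000],
    "32-bit OS, 32-bit named": [44000, 45500, 43900],
}

if __name__ == "__main__":
    name, qps = fastest_config(example_runs)
    print(f"{name}: median {qps} qps")
```

Taking several runs per configuration and comparing medians is just one
reasonable choice; the point is that every run comes from the same
hardware, so the result answers the question for that machine only.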

