9.2.5 db causes high cpu? was: Re: BIND 9.2.5rc1 is now available.

Paul Vixie paul at vix.com
Mon Feb 21 16:44:06 UTC 2005


> 	I don't think I'd make this a compile-time option.  One of the 
> good things about BIND is that it is capable of running both 
> authoritative and recursive services on the same process/IP address.
> 
> 	Many years ago, I recommended that people run split servers, 
> where you had recursive-only servers on one set of machines and 
> authoritative-only servers on another set, and you would not try to 
> mix the two services.  But there are some cases where people are 
> running machines in environments where the number of systems 
> available to them is limited, or where the number of IP addresses 
> is limited, and if you have a server like Nominum ANS or Nominum CNS 
> that is only capable of handling one job or the other but not both, 
> then you're screwed.
> 
> 	BIND allows you to handle non-optimal situations like this.  Some 
> other servers don't.  I consider this a key advantage of BIND.

i don't.  what should a server do if it is authoritative for the ISC.ORG
zone but not the LAB.ISC.ORG zone, and a query arrives for SOL.LAB.ISC.ORG?

a correctly functioning recursive server would forward the query, wait for
the response, cache the response, and forward the response.

a correctly functioning authority server would answer with a referral.

clearly a server that is both recursive and authoritative has to do one
or the other, and so it will be wrong.  this is the kind of thing that
makes otherwise-good servers look bad to tools like "doc".
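
to make the conflict concrete, here is a minimal python sketch of the decision described above. it is an illustration only, not BIND's code: the function names, the "role" argument, and the returned strings are all made up for this example, and the "refuse" case for names outside every zone is an assumption that the text above does not address.

```python
def _norm(name):
    # lower-case and drop the trailing root dot so names compare consistently
    return name.rstrip(".").lower()


def closest_enclosing(qname, names):
    """Return the longest entry of `names` that equals or encloses qname, else None."""
    labels = _norm(qname).split(".")
    known = {_norm(n) for n in names}
    # try the full name first, then strip one leading label at a time
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in known:
            return candidate
    return None


def respond(qname, auth_zones, zone_cuts, role):
    """Hypothetical decision: role is 'authority' or 'recursive'."""
    zone = closest_enclosing(qname, auth_zones)
    cut = closest_enclosing(qname, zone_cuts)
    # the name is delegated away if a zone cut sits below our own zone apex
    delegated = zone is not None and cut is not None and len(cut) > len(zone)
    if zone is not None and not delegated:
        return "answer"      # we hold the data ourselves
    if role == "recursive":
        return "recurse"     # forward, wait, cache, then forward the response
    # assumption: an authority-only server refuses names outside its zones
    return "referral" if delegated else "refuse"


zones = {"isc.org."}
cuts = {"lab.isc.org."}
print(respond("SOL.LAB.ISC.ORG", zones, cuts, "authority"))  # referral
print(respond("SOL.LAB.ISC.ORG", zones, cuts, "recursive"))  # recurse
```

the same query gets two different "correct" responses depending on the role, so a server playing both roles has to pick one of them.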

this dual-purpose name service idea was silly when first implemented and
we would remove it in a heartbeat if most of the world weren't depending
on it.  (this is analogous to the multiple-cname logic.)

i thought of writing a BCP on this topic years ago, but when i sat down
to draft the document it came out "don't add silly things to your software,
and if your software vendor provides silliness, avoid using it."  not very
helpful.

> 	However, this does tend to lead people to misconfigure their BIND 
> servers, so I think it would be a good idea for future versions of 
> BIND to come up in a "default secure" mode.  Whereby, if you 
> configure your server to be authoritative for any zones beyond the 
> standard ones for "0.0.127.in-addr.arpa." and "localhost.", then the 
> server should refuse to perform recursion.  Likewise, if you 
> configure the server to handle recursion, then it should refuse to 
> answer queries from outside your network, and any other things that 
> are normally appropriate for recursive-only servers.

sadly, that would break far more things than it would fix.  as behaviour
that can be enabled with a keyword it sounds good in general -- but not
as a default.
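
as a rough python sketch of the difference between "default" and "enabled with a keyword": the strict_mode flag below is a hypothetical stand-in for such a keyword (it is not an existing BIND option), and the rule it applies is the one from the quoted proposal, namely that recursion is refused once the server is authoritative for anything beyond the two standard zones.

```python
# the two zones the quoted proposal treats as "standard"
STANDARD_ZONES = {"0.0.127.in-addr.arpa", "localhost"}


def recursion_allowed(configured_zones, strict_mode=False):
    """Hypothetical policy check; strict_mode models an opt-in keyword."""
    # normalise names: lower-case, no trailing root dot
    zones = {z.rstrip(".").lower() for z in configured_zones}
    extra = zones - STANDARD_ZONES
    if strict_mode and extra:
        # authoritative for real zones and strict mode requested: no recursion
        return False
    # without the keyword, behaviour stays as it is today
    return True


mixed = ["localhost.", "0.0.127.in-addr.arpa.", "isc.org."]
print(recursion_allowed(mixed))                    # True
print(recursion_allowed(mixed, strict_mode=True))  # False
```

with strict_mode off by default, existing dual-purpose configurations keep working; only operators who ask for the stricter behaviour get it.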

> 	In other words, instead of making recursive-only or 
> authoritative-only compile-time options, instead make them default 
> operational modes which are automatically detected and implemented by 
> the software, but allow people to explicitly configure their server 
> so as to provide both functions, if they do the right "wave a dead 
> chicken" dance.

according to www.isc.org/ops/ds/, bind runs >90% of all nameservers
in the internet.  according to my experience, >89% of them depend on
this dual-purpose silliness.  we really can't change the default.  we
could, however, write up a strongly-worded release note on this topic.

> >  I followed Jinmei's recommendation of disabling threads and trying
> > to run just one BIND9 process at first. So i am currently measuring
> > 2 BIND8 processes against 1 BIND9 process on two identical machines.
> 
> 	With threading disabled on the BIND9 process, so it should be 
> significantly slower than the two BIND8 processes, but at least you 
> should get a clearer picture as to what else is going on.

that's a non sequitur.  the driving issue in recursive nameserver performance
isn't CPU speed or coding logic, it's RTT to distant nameservers or your
own upstream network connection.  a carefully crafted lab-test of queryperf
through a recursive server with all the authority zones reachable inside
the lab network *might* show that two BIND8's are faster than one BIND9,
but even that's not a given.  BIND8's retry logic is very different from
BIND9's, and a given query might be answered in far more or far less time
under stress conditions on one of these servers than on the other.
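
a back-of-the-envelope python sketch of why RTT dominates. every number here is invented for illustration (none of them is a measurement of BIND8 or BIND9), and the model is deliberately crude: one resolution costs some local CPU time plus the sum of the round trips to the distant servers it has to ask, with no retries or timeouts.

```python
def resolution_time_us(cpu_us, rtts_us):
    """Total time for one cache-miss resolution, in microseconds."""
    # local processing plus every round trip to a remote nameserver
    return cpu_us + sum(rtts_us)


# hypothetical round trips: root, TLD, and the zone's own server
rtts = [80_000, 120_000, 40_000]

slow_server = resolution_time_us(200, rtts)  # assumed 200 us of CPU per query
fast_server = resolution_time_us(100, rtts)  # assumed twice as fast on CPU

print(slow_server, fast_server)
print(f"saving: {100 * (slow_server - fast_server) / slow_server:.3f}%")
```

under these made-up numbers, halving the CPU cost changes the total by a small fraction of a percent, which is why a lab test with every authority server on the local network says little about real-world recursive performance.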

