BIND 9.3.0 issues regarding shutdown and memory usage

Tue Jan 25 01:35:44 UTC 2005

> Hello everyone,
> 
> excuse me if these questions have already been answered. I could not
> find anything by searching through the archive, but I may well have
> used the wrong keywords.
> 
> I'm currently trying to use BIND 9.3.0 as an IPv6-enabled resolver for
> our mail-setup. It has one big slaved zone (rbl-plus.mail-abuse.org,
> size of the zonefile is about 110M), everything else is cached. It is
> running on a Dual-Xeon (HT-enabled) on SLES9. I compiled it myself using
> no specific options except --with-threads.
> 
> I have found two problems I could not really fix so far:
> 
> a) When I issue a 'rndc stop' BIND gives me the following output when
>    started with "-g -d 2":
> 
>    24-Jan-2005 16:42:53.872 received control channel command 'null'
>    24-Jan-2005 16:42:53.872 received control channel command 'stop'
>    24-Jan-2005 16:42:53.872 shutting down: flushing changes
>    24-Jan-2005 16:42:53.872 stopping command channel on 127.0.0.1#953
>    24-Jan-2005 16:42:53.873 no longer listening on 10.156.6.10#53
>    24-Jan-2005 16:42:53.873 calling free_rbtdb(.)
>    24-Jan-2005 16:42:53.873 done free_rbtdb(.)
>    24-Jan-2005 16:42:53.875 calling free_rbtdb(.)
>    24-Jan-2005 16:42:53.875 done free_rbtdb(.)
>    24-Jan-2005 16:42:53.879 calling free_rbtdb(rbl-plus.mail-abuse.org)
>    24-Jan-2005 16:42:53.880 calling free_rbtdb(.)
>    24-Jan-2005 16:42:53.880 done free_rbtdb(.)
>    24-Jan-2005 16:42:53.880 calling free_rbtdb(.)
>    24-Jan-2005 16:42:53.882 done free_rbtdb(.)
> 
>    Here BIND sits eating 100% of one of the four logical CPUs and does
>    nothing for at least 15 minutes (maybe longer, I killed it after that
>    period). It doesn't listen to rndc anymore (of course) and does nothing
>    I can see. SIGHUP, SIGTERM and every other signal except SIGKILL have
>    no effect.
> 
>    This behaviour is consistent whether I used threading or not. No
>    difference here.

	This is a known issue. It is addressed in BIND 9.2.5 and 9.3.1.
 
> b) Something I couldn't reproduce so far, but perhaps someone has an
>    idea what could cause something like this. Over the weekend (two days
>    up) the threaded binary grew to 1.5G of used memory. After a restart all
>    versions are at about 500M, and after one day of hard testing maybe 700M.
> 
>  1.) In this state (1.5G used) I could not get it to use a single byte
>      more memory even when running some 100k random queries through it.
>      I read somewhere that this is the process limit of x86 Linux, any
>      idea how to increase that? The box has 4G RAM and is dedicated to
>      that service, so I'd like to use it.
> 
>      Kernel 2.6.5-7.111.19-bigsmp #1 SMP (SLES9 distribution kernel)
> 
>  2.) Memory consumption seems to be way too high. During testing the box
>      is currently doing about 5 qps, and even when manually running 400k
>      queries dumped on another resolver I can't get it past 700M. But
>      during this particular run it reached 1.5G and stayed there.
> 
> I would be glad about any tips you have to offer.

	When named transfers a zone it makes a new version of it in memory.
	When the transfer is complete it replaces the old version with
	the new version and frees the old version, so both versions are
	held in memory while the transfer is in progress.  Repeat for the
	next transfer.  Add to that some memory fragmentation and you get
	what you see here.
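
	As a rough illustration of the replace-then-free pattern above, here
	is a toy model in Python. It is only a sketch: the function name and
	all sizes are hypothetical placeholders, not measurements taken from
	named, and it ignores fragmentation entirely.

```python
# Toy model of the replace-then-free pattern described above.
# All sizes are hypothetical placeholders, not measurements from named.

def transfer_memory_profile(zone_mb, cache_mb, transfers):
    """Return (steady_mb, peak_mb) for a process that keeps one copy of a
    zone at rest and holds two copies while each transfer is in flight."""
    steady = cache_mb + zone_mb
    peak = steady
    for _ in range(transfers):
        # While a transfer runs, the old and the new version coexist.
        in_flight = cache_mb + 2 * zone_mb
        peak = max(peak, in_flight)
        # The old version is freed here, so usage drops back to `steady`.
    return steady, peak


steady, peak = transfer_memory_profile(zone_mb=400, cache_mb=100, transfers=3)
print(f"steady: {steady} MB, peak during transfer: {peak} MB")
```

	The point of the sketch is that the high-water mark is set by the
	moment both versions exist. If the allocator does not hand the freed
	pages back to the operating system, the process size reported by the
	OS can stay near that peak rather than near the steady figure.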

	You could try defining ISC_MEM_USE_INTERNAL_MALLOC and rebuilding
	mem.o (libisc) to see whether that behaves differently.
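
	On the per-process limit asked about in 1.), one thing that can be
	checked is the resource limits a process inherits from its
	environment. The following is a minimal sketch using Python's
	standard `resource` module (Unix only). The helper name
	`describe_limit` is made up for illustration, and the values it
	prints are those of the Python process itself, so it only reflects
	what named would inherit if it were started from the same shell. It
	does not say anything about the address-space layout of the kernel.

```python
import resource


def describe_limit(name):
    """Format the soft and hard value of one RLIMIT_* constant in MiB."""
    soft, hard = resource.getrlimit(getattr(resource, name))

    def fmt(value):
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // (1024 * 1024)} MiB"

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


# Address-space and data-segment limits are the ones most likely to cap
# how much memory a single process can allocate.
for name in ("RLIMIT_AS", "RLIMIT_DATA"):
    print(describe_limit(name))
```

	If both report "unlimited", the ceiling is coming from somewhere
	other than these limits.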

> Bernhard
> 
> 
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark_Andrews at isc.org