9.5.0-P1: out of memory error - SERVFAIL

Kai Lanz lanz at stanford.edu
Tue Aug 12 18:36:26 UTC 2008


We run BIND 9.5.0-P1 on an Alpha Tru64-4.0G box. Yesterday morning
our named suddenly stopped serving requests, although the daemon
had not died; the log file showed two entries for every request:

Aug 11 04:01:14 pangea named[10832]: resolver.c:3239: unexpected error:
Aug 11 04:01:14 pangea named[10832]: isc_timer_create: out of memory

There were tens of thousands of these. Dig shows that named is returning
status SERVFAIL for every request. Top shows that named is using a huge
amount of memory: resident set size over 700MB.

I tried to fix this by setting max-cache-size to 32M in the options
section of our named.conf and restarting named, but that doesn't seem
to have helped -- in 24 hours, named's memory usage has risen to 400M
and is still climbing. Is there a way to fix this problem, or at least
control named's behavior?
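
For reference, the setting described above would look roughly like the
fragment below. This is only a sketch: the rest of the options block is
omitted, and the exact layout is an assumption, not a copy of the real
named.conf.

```
options {
        // Other existing options omitted (assumed to be present).
        // Limit the memory used for the cache, as described above.
        max-cache-size 32M;
};
```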

Another possibly-related observation: At this point, with named having
run for a day and using 400M of memory, I decided to take a look at the
cache to see if it was growing wildly, so I ran "rndc dumpdb". This saved
a cache_dump.db file of about 5M, which I guess is reasonable, but it
also caused named to dump core and die. The core file is incomplete
because there wasn't enough room in /etc to save it all. The log file
didn't record any error when named crashed; the last entry is just
"dumpdb complete".

After restarting named, I tried "rndc dumpdb" again. This time it worked
fine, saving a 700K cache file; no crash, no core dump.

The dumpdb crash worries me, but we really need to fix the out of
memory problem. Any suggestions?

-- 
Kai Lanz


