failed: out of memory

Thomas Schulz schulz at adi.com
Tue Jul 22 15:51:29 UTC 2014


> You'll want to use max-cache-size to enforce a hard limit on the size 
> of your cache.  
> http://www.zytrax.com/books/dns/ch7/hkpng.html#max-cache-size
> 
> /Tim
> 
> ---
> Tim Krzywonos
> e:: tim at krzywonos.ca

Thanks for reminding me of that. Now that I have some confidence
that the problem is the cache and not some funny memory leak, I
think I can rely on setting a limit. I still find it strange that
when I dumped the cache, the resulting file was only 6 MB in size.
The process size had grown to 257 MB, up from an initial size of
36 MB. It does not make sense.
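
To put numbers on that gap, here is a rough sketch in Python. It assumes the 8192-byte page size and the 31305-page figure reported further down in this thread, and it uses decimal megabytes. The function and variable names are made up for illustration.

```python
PAGE_SIZE = 8192  # bytes per page, as reported for this host


def pages_to_bytes(pages: int) -> int:
    """Convert a process size in pages to bytes."""
    return pages * PAGE_SIZE


def to_mb(nbytes: float) -> float:
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return nbytes / 1_000_000


dump_bytes = 5_927_042     # size of named_dump.db after rndc dumpdb -cache
post_flush_pages = 31_305  # process size reported after rndc flush

post_flush_bytes = pages_to_bytes(post_flush_pages)

# Growth of the process, using the MB figures quoted in this message.
growth_mb = 257 - 36

# How many times larger the growth is than the dumped cache file.
ratio = growth_mb / to_mb(dump_bytes)

print(f"process after flush: {to_mb(post_flush_bytes):.1f} MB")
print(f"cache dump file:     {to_mb(dump_bytes):.1f} MB")
print(f"growth is roughly {ratio:.0f} times the dump size")
```

The dump is a text rendering of the cached records, while the process holds them in in-memory structures, so the two sizes are not expected to match one to one. The sketch only quantifies how far apart they are.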
 
> On 2014-07-21 10:57, schulz at adi.com wrote:
>>>> Have you tried an rndc flush?  You can also dump the contents of the
>>>> cache to find the (approximate) size of the cache.  If related to cache,
>>>> you can tweak parameters to cache, most namely max-cache-size.  IIRC,
>>>> the cache doesn't have a size limit by default.
>>>>
>>>> /Tim
>>>>
>>> I did an rndc dumpdb -cache and the size of the named_dump.db that
>>> resulted is 5927042 bytes. Not all that big considering how it is
>>> formatted.
>>> Late last night I did an rndc flush. At that time the size of named
>>> was 31305 pages of 8192 bytes. As of now (13 hours later) the size
>>> is still 31305. I will see what happens.
>>
>> See below for our named.conf and then my original description of the
>> problem.
>>
>> As of this morning (3 days 12 hours later) named is still at 31305 pages.
>> So it appears that the continuous growth that I was seeing is due to the
>> cache.
>>
>> Unfortunately my investigation has not been very methodical. I should
>> have noted the size of the named process when I was getting the out of
>> memory errors. I also should have noted the rate of growth of 9.9.5-P1
>> before trying 9.9.6b1. I am going to switch back to 9.9.5-P1 for a few
>> days and see if the rate of growth is about the same or if it is much
>> worse. (The initial size was 4734 pages and jumped to 7666 within
>> 5 minutes). Assuming that the cache cleaning is working correctly, it
>> may be that a 32 bit process is just not viable these days. I have now
>> built a 64 bit named and will switch to that in a few days.
>>
>> A big problem is that I will be going on vacation at the end of the week
>> and I really want to make sure that named does not shut down while I am
>> away. There is really not enough time to do enough testing to make
>> sure of that. I may set up a cron job to do a daily rndc flush while I
>> am away.
>>
>>>
>>> I was asked off list for our named.conf. Here it is.
>>> options {
>>>         directory "/var/named";
>>>         acache-enable yes;
>>>         auth-nxdomain no;
>>>         transfer-format many-answers;
>>>         dnssec-enable yes;
>>>         dnssec-validation yes;
>>>         dnssec-lookaside auto;
>>> };
>>> managed-keys {
>>>     dlv.isc.org. initial-key 257 3 5 .....;
>>> };
>>> managed-keys {
>>>   "." initial-key 257 3 8 .......;
>>> };
>>>
>>> view "internal" {
>>>         match-clients { !192.168.3.95; !192.168.3.150;
>>>                         !192.168.4.0/24; localnets;
>>>         };
>>>         sortlist {
>>>                 { 192.168.2.0/24; { 192.168.2.0/24; 192.168.3.0/24; }; };
>>>                 { 192.168.3.0/24; { 192.168.3.0/24; 192.168.2.0/24; }; };
>>>         };
>>>         zone "." {
>>>                 type hint;
>>>                 file "named.root";
>>>         };
>>>
>>>         zone "adi.com" {
>>>                 type master;
>>>                 file "adi.com.hosts.int";
>>>                 check-names ignore;
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                 };
>>>         };
>>>
>>>         zone "130-157.245.100.75.in-addr.arpa" {
>>>                 type master;
>>>                 file "75.100.245.130-157.revhosts";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                 };
>>>         };
>>>
>>>         zone "2.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.2.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                 };
>>>         };
>>>
>>>         zone "3.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.3.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                 };
>>>         };
>>>
>>>         zone "4.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.2.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                 };
>>>         };
>>>
>>>         zone "localhost" {
>>>                 type master;
>>>                 notify no;
>>>                 file "named.local";
>>>         };
>>>
>>>         zone "0.0.127.in-addr.arpa" {
>>>                 type master;
>>>                 notify no;
>>>                 file "named.revlocal";
>>>         };
>>>
>>>         zone "com" {
>>>                 type delegation-only;
>>>         };
>>>
>>>         zone "net" {
>>>                 type delegation-only;
>>>         };
>>> };
>>>
>>> view "internal4" {
>>>         match-clients { 192.168.4.0/24; };
>>>         zone "." {
>>>                 type hint;
>>>                 file "named.root";
>>>         };
>>>
>>>         zone "adi.com" {
>>>                 type master;
>>>                 file "adi.com.hosts.int4";
>>>                 check-names ignore;
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.4.95;
>>>                         192.168.4.150;
>>>                 };
>>>         };
>>>
>>>         zone "130-157.245.100.75.in-addr.arpa" {
>>>                 type master;
>>>                 file "75.100.245.130-157.revhosts";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.4.95;
>>>                         192.168.4.150;
>>>                 };
>>>         };
>>>
>>>         zone "2.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.2.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.4.95;
>>>                         192.168.4.150;
>>>                 };
>>>         };
>>>
>>>         zone "3.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.3.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.4.95;
>>>                         192.168.4.150;
>>>                 };
>>>         };
>>>
>>>         zone "4.168.192.in-addr.arpa" {
>>>                 type master;
>>>                 file "192.168.2.revhosts.int";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.4.95;
>>>                         192.168.4.150;
>>>                 };
>>>         };
>>>
>>>         zone "localhost" {
>>>                 type master;
>>>                 notify no;
>>>                 file "named.local";
>>>         };
>>>
>>>         zone "0.0.127.in-addr.arpa" {
>>>                 type master;
>>>                 notify no;
>>>                 file "named.revlocal";
>>>         };
>>>
>>>         zone "com" {
>>>                 type delegation-only;
>>>         };
>>>
>>>         zone "net" {
>>>                 type delegation-only;
>>>         };
>>> };
>>>
>>> view "external" {
>>>         match-clients { any; };
>>>         allow-recursion { 75.100.245.0/24; };
>>>         zone "." {
>>>                 type hint;
>>>                 file "named.root";
>>>         };
>>>
>>>         zone "adi.com" {
>>>                 type master;
>>>                 file "adi.com.hosts.ext";
>>>                 inline-signing yes;
>>>                 key-directory "dnssec";
>>>                 auto-dnssec maintain;
>>>                 also-notify {
>>>                         192.168.3.95;
>>>                         192.168.3.150;
>>>                         216.170.230.22;
>>>                 };
>>>         };
>>>
>>>         zone "130-157.245.100.75.in-addr.arpa" {
>>>                 type master;
>>>                 file "75.100.245.130-157.revhosts";
>>>                 notify explicit;
>>>                 also-notify {
>>>                         192.168.2.95;
>>>                         192.168.2.150;
>>>                         216.170.230.22;
>>>                 };
>>>         };
>>>
>>>         zone "com" {
>>>                 type delegation-only;
>>>         };
>>>
>>>         zone "net" {
>>>                 type delegation-only;
>>>         };
>>> };
>>>
>>>>
>>>> On 2014-07-17 10:39, schulz at adi.com wrote:
>>>>> We are running Bind on a Sun Sparc machine running Solaris 8. Bind is
>>>>> built as a 32 bit executable as that is the default and is the way
>>>>> libcrypto and libxml2 are built. We have been running Bind 9.9.5.
>>>>> I am now trying Bind 9.9.6b1 as that claims to have fixed some memory
>>>>> leaks.
>>>>>
>>>>> For some time now Bind has stopped being able to do recursive queries
>>>>> every couple of weeks and I have been just restarting it. I decided to
>>>>> look into this and found it logging out of memory errors. This seems to
>>>>> have started happening after I set up bind to sign our domain, adi.com.
>>>>> The server is bluegill.adi.com. It is set up with 3 views. Two are
>>>>> internal views and can do recursive queries. One is the external view
>>>>> and does not allow recursive queries.
>>>>>
>>>>> Since restarting named, this time Bind 9.9.6b1, I have been checking
>>>>> the memory usage every day. The usage in pages of 8192 bytes for the
>>>>> last 7 days are:
>>>>> 16,517  19,221  20,111  23,707  24,957  26,384  28,231  29,912
>>>>>
>>>>> Note that this shows no signs of settling down. I am looking into
>>>>> the possibility of rebuilding Bind as a 64 bit executable as that
>>>>> should take much longer to run out of memory.
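
A back-of-the-envelope estimate of the growth rate in the figures above, as a minimal Python sketch. It assumes the eight values are consecutive daily samples. The 4 GiB ceiling is only the theoretical address-space limit of a 32 bit process, used here as a placeholder; the real limit on this host is not stated in the thread and is likely lower. All names are illustrative.

```python
PAGE_SIZE = 8192  # bytes per page, as stated in the report

# Daily process sizes in pages, copied from the report above.
daily_pages = [16517, 19221, 20111, 23707, 24957, 26384, 28231, 29912]


def average_daily_growth(samples):
    """Mean increase per interval between consecutive samples."""
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)


def days_until(limit_bytes, current_pages, pages_per_day):
    """Days until limit_bytes is reached if growth stays linear."""
    remaining_pages = limit_bytes / PAGE_SIZE - current_pages
    return remaining_pages / pages_per_day


pages_per_day = average_daily_growth(daily_pages)
mb_per_day = pages_per_day * PAGE_SIZE / 1_000_000

# Placeholder ceiling: the full 32 bit address space (an assumption).
ASSUMED_LIMIT_BYTES = 4 * 2**30
eta_days = days_until(ASSUMED_LIMIT_BYTES, daily_pages[-1], pages_per_day)

print(f"{pages_per_day:.0f} pages/day, about {mb_per_day:.1f} MB/day")
print(f"about {eta_days:.0f} days to reach the assumed ceiling")
```

On these samples the average comes to roughly 1914 pages, or about 15.7 MB, per day. Passing a smaller limit_bytes to days_until models a lower per-process limit.
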
>>>>>
>>>>> A recent section of the log showing that the cleaner is running:
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 91e6a30 stats: hits=0 misses=6 queries=6 adds=6 deleted=5
>>>>> cleaned=5 cleaner_runs=140 overmem=0 overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 91e6a30 cleaning interval set to 3600.
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 933f990 stats: hits=3299 misses=79 queries=3378 adds=86
>>>>> deleted=370 cleaned=370 cleaner_runs=144 overmem=0
>>>>> overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:44 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 933f990 cleaning interval set to 3600.
>>>>> Jul 17 10:24:46 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 9166a20 stats: hits=76514 misses=4348 queries=80862 adds=4348
>>>>> deleted=3717 cleaned=3717 cleaner_runs=144 overmem=0
>>>>> overmem_nocreates=0 nomem=0
>>>>> Jul 17 10:24:46 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> acache 9166a20 cleaning interval set to 3600.
>>>>> Jul 17 10:29:51 bluegill named[9334]: [ID 873579 daemon.notice]
>>>>> clients-per-query decreased to 10
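
The counters in the acache lines above can be pulled out with a few lines of Python. This is a minimal sketch: the three sample strings were joined by hand from the wrapped log text, and the helper name parse_acache_stats is purely illustrative. Note that these are statistics for the acache (the additional section cache turned on by acache-enable), which is separate from the resolver cache that max-cache-size limits.

```python
import re

# Stats messages from the log excerpt, each joined onto a single line.
SAMPLE_LINES = [
    "acache 91e6a30 stats: hits=0 misses=6 queries=6 adds=6 deleted=5 "
    "cleaned=5 cleaner_runs=140 overmem=0 overmem_nocreates=0 nomem=0",
    "acache 933f990 stats: hits=3299 misses=79 queries=3378 adds=86 "
    "deleted=370 cleaned=370 cleaner_runs=144 overmem=0 "
    "overmem_nocreates=0 nomem=0",
    "acache 9166a20 stats: hits=76514 misses=4348 queries=80862 adds=4348 "
    "deleted=3717 cleaned=3717 cleaner_runs=144 overmem=0 "
    "overmem_nocreates=0 nomem=0",
]


def parse_acache_stats(line):
    """Return (acache_id, counters) for a stats line, or None if no match."""
    match = re.search(r"acache (\S+) stats: (.+)", line)
    if match is None:
        return None
    pairs = re.findall(r"(\w+)=(\d+)", match.group(2))
    return match.group(1), {key: int(value) for key, value in pairs}


stats = {}
for line in SAMPLE_LINES:
    parsed = parse_acache_stats(line)
    if parsed is not None:
        acache_id, counters = parsed
        stats[acache_id] = counters

for acache_id, c in stats.items():
    # Guard against division by zero for an acache with no queries yet.
    ratio = c["hits"] / c["queries"] if c["queries"] else 0.0
    print(f"{acache_id}: hit ratio {ratio:.3f}, "
          f"overmem={c['overmem']}, nomem={c['nomem']}")
```

In all three samples overmem and nomem are 0, so these lines by themselves do not show the acache running into a memory limit.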

Tom Schulz
Applied Dynamics Intl.
schulz at adi.com

