how to debug "no more recursive clients"

Len Conrad LConrad at
Wed Feb 8 15:18:59 UTC 2006

We have a "forwarding DNS" setup, where all the internal DNS servers 
"forward first" to our recursive-only servers rns1 and rns2 (recursive ns).

rns1's named process has a memory footprint of about 170 - 200 MB on 
average.

Both rns machines were running BIND 9.3.1 and were upgraded yesterday 
to 9.3.2, on FreeBSD 4.x.

BIND is set up with:

blackhole { !<mynetworks>; };

On two occasions, rns1 has apparently stopped resolving recursive 
queries, with syslog showing many lines of:

no more recursive clients: quota reached

"rndc status" reports the quota to be 1000 clients.

We have been running "rndc status" every 5 minutes for several days and 
logging the "recursive clients" line.  The average is only 58/1000 
recursive clients, with a handful of samples in the 200 - 400 range.
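
For anyone wanting to reproduce that polling, here is a minimal sketch in 
Python. It assumes the status line looks like "recursive clients: 58/1000" 
(some later BIND releases print three fields, e.g. "0/900/1000", in which 
case the last number is treated as the quota). The function names and the 
80% warning threshold are illustrative, not anything BIND provides.

```python
import re
import subprocess
from typing import Optional, Tuple

# Assumed line format: "recursive clients: 58/1000". An optional middle
# field (as in "0/900/1000") is skipped; the last number is the quota.
_LINE = re.compile(r"^recursive clients:\s*(\d+)(?:/\d+)?/(\d+)\s*$", re.MULTILINE)


def parse_recursive_clients(status_text: str) -> Optional[Tuple[int, int]]:
    """Return (current, quota) from 'rndc status' output, or None if absent."""
    m = _LINE.search(status_text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def near_quota(current: int, quota: int, threshold: float = 0.8) -> bool:
    """True when usage is at or above the (hypothetical) warning threshold."""
    return quota > 0 and current / quota >= threshold


def sample() -> Optional[Tuple[int, int]]:
    """Run 'rndc status' once; requires rndc to be installed and configured."""
    out = subprocess.run(
        ["rndc", "status"], capture_output=True, text=True, check=True
    ).stdout
    return parse_recursive_clients(out)
```

Run from cron at a much shorter interval than 5 minutes, something like 
this has a better chance of catching a short-lived spike that an average 
over 5-minute samples would hide.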

We also turned on BIND logging and found rns1 issuing an average of 
30 createfetch/second, measured over several business hours.
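
An average can hide bursts, so it may help to count createfetch lines per 
second rather than over hours. The sketch below is Python and assumes the 
log channel has "print-time yes", so each line starts with a timestamp of 
the form "08-Feb-2006 15:18:59.123", and that the message contains the 
text "createfetch:". Both are assumptions about the local logging config; 
the function names are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed timestamp prefix, truncated to whole seconds for bucketing.
_TS = re.compile(r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})")


def createfetch_per_second(lines: Iterable[str]) -> Counter:
    """Count createfetch log lines per one-second bucket."""
    counts: Counter = Counter()
    for line in lines:
        if "createfetch:" not in line:
            continue
        m = _TS.match(line)
        if m:  # lines without a recognizable timestamp are skipped
            counts[m.group(1)] += 1
    return counts


def busiest(counts: Counter, n: int = 5) -> List[Tuple[str, int]]:
    """Return the n seconds with the most createfetch entries."""
    return counts.most_common(n)
```

Feeding it an open log file (for example `createfetch_per_second(open(path))`, 
with `path` pointing at wherever the resolver channel writes) and looking 
at `busiest()` shows whether the seconds before a lock-up look different 
from the 30/second baseline.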

When rns1 quit resolving those two times, in a period of about 8 weeks 
of operation, a reboot of the machine and a stop/start of BIND solved 
the problem.

We figure that some upstream network problem could cause BIND's 
createfetch requests to pile up as a lot of timing-out queries in a 
hurry, but our network monitors showed no network problems in the two 
periods when BIND quit resolving.

What is BIND9's behavior when it hits the "recursive clients" quota?  For 
us, it seems to die, requiring named to be stopped and started again.

Any suggestions beyond the above for debugging this problem?


