Recursive bind becomes unresponsive with high load
Mike.Mitchell at sas.com
Fri Apr 1 21:48:01 UTC 2016
Have you checked the kernel's connection tracking statistics?
Here's a link:
I've had to increase some network parameters on our busy nameservers. I put the following in /etc/sysctl.conf:
net.netfilter.nf_conntrack_udp_timeout_stream = 45
net.nf_conntrack_max = 500000
net.ipv4.neigh.default.gc_thresh1 = 512
net.ipv4.neigh.default.gc_thresh2 = 1024
net.ipv4.neigh.default.gc_thresh3 = 2048
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_recycle = 1
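To see whether the connection tracking table is actually close to its limit before raising nf_conntrack_max, something like the following sketch may help. It assumes the nf_conntrack module is loaded, so that nf_conntrack_count and nf_conntrack_max exist under /proc/sys/net/netfilter. The function names and the 80% warning threshold are illustrative choices, not anything taken from this thread.

```python
from pathlib import Path
from typing import Tuple


def conntrack_usage(count: int, maximum: int) -> float:
    """Return the fraction of the conntrack table in use (0.0 to 1.0)."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


def read_conntrack(base: str = "/proc/sys/net/netfilter") -> Tuple[int, int]:
    """Read (current entries, table limit) from the kernel's sysctl files."""
    root = Path(base)
    count = int((root / "nf_conntrack_count").read_text().strip())
    maximum = int((root / "nf_conntrack_max").read_text().strip())
    return count, maximum


if __name__ == "__main__":
    try:
        count, maximum = read_conntrack()
    except OSError as exc:
        # Typically means conntrack is not loaded on this host.
        print(f"conntrack counters not readable: {exc}")
    else:
        usage = conntrack_usage(count, maximum)
        print(f"conntrack entries: {count} of {maximum} ({usage:.1%})")
        # 0.8 is an arbitrary threshold chosen for this sketch.
        if usage > 0.8:
            print("table is nearly full; new flows may be dropped")
```

Running this periodically while the problem occurs shows whether the table fills up at the same time the resolver stops answering.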
From: bind-users-bounces at lists.isc.org <bind-users-bounces at lists.isc.org> on behalf of Michael Brunnbauer <brunni at netestate.de>
Sent: Friday, April 1, 2016 12:29 PM
To: Mathew Ian Eis
Cc: dot at dotat.at; bind-users at lists.isc.org
Subject: Re: Recursive bind becomes unresponsive with high load
On Fri, Apr 01, 2016 at 04:01:04PM +0000, Mathew Ian Eis wrote:
> What OS are you running your BIND server on? Is it virtualized?
Linux kernel 3.4.111 with glibc 2.22, 32-bit, not virtualized. No distribution;
everything was compiled by hand.
> Is it fully unresponsive, or could it be simply taking longer to respond than your client timeout?
Assuming that bind would report dropped queries, I guess it is the latter.
Regarding the suggestion made by Tony Finch about too many TCP connections
in the TIME_WAIT state: that would have been a good explanation. But I do not
see more than 200 TCP connections in the TIME_WAIT state when the problem occurs,
and not more than 5000 TCP/UDP connections with port 53.
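For reference, a count like the one above can be taken directly from /proc/net/tcp. This is a minimal sketch, not the method used in this thread. It assumes the usual Linux layout of that file (hex address:port pairs in the second and third columns, hex state code in the fourth, with 06 meaning TIME_WAIT), and count_states is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable

TIME_WAIT = "06"  # hex state code for TIME_WAIT in /proc/net/tcp


def count_states(lines: Iterable[str], port: int = 53) -> Counter:
    """Count TCP socket states for rows whose local or remote port matches."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row and anything that is not a socket entry.
        if len(fields) < 4 or ":" not in fields[1] or ":" not in fields[2]:
            continue
        try:
            local_port = int(fields[1].rsplit(":", 1)[1], 16)
            remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        except ValueError:
            continue
        if port in (local_port, remote_port):
            counts[fields[3]] += 1
    return counts


if __name__ == "__main__":
    total: Counter = Counter()
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as handle:
                total += count_states(handle)
        except OSError:
            # The file may be absent, e.g. no IPv6 support.
            continue
    print(f"port 53 TCP sockets in TIME_WAIT: {total[TIME_WAIT]}")
    print(f"port 53 TCP sockets in any state: {sum(total.values())}")
```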
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail brunni at netestate.de
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel