Tuning suggestions for high-core-count Linux servers

MURTARI, JOHN jm5903 at att.com
Wed May 31 12:42:46 UTC 2017

	You didn't mention which OS you are using; I assume some version of Linux. What you are seeing may be an OS limit rather than a BIND limit. One thing we noted with Red Hat is that the kernel just couldn't keep up with the inbound UDP packets (queue overflow). The kernel does keep a count of dropped UDP packets; unfortunately, I can't recall the command we used to monitor it. I found this on Google: https://linux-tips.com/t/udp-packet-drops-and-packet-receive-error-difference/237 .
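
	For what it's worth, here is a rough sketch of one way to read those counters, assuming a Linux host that exposes /proc/net/snmp. This is not the command mentioned above (which I can't recall). The function names and the sample numbers are made up for illustration. As far as I understand it, RcvbufErrors counts datagrams dropped because a socket receive buffer was full, and InErrors includes those plus other receive failures.

```python
# Sketch only: parse the "Udp:" header/value line pair from
# /proc/net/snmp-style text. Helper names are illustrative.

# Made-up sample in the layout Linux uses for /proc/net/snmp.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1000 2 30 900 25 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
UdpLite: 0 0 0 0 0 0
"""


def parse_udp_counters(snmp_text):
    """Return the Udp counters as a dict of name -> int."""
    # "UdpLite:" lines do not match because the prefix includes the colon.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value pair found")
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


def drop_delta(before, after):
    """Difference in the drop-related counters between two snapshots."""
    return {name: after[name] - before[name]
            for name in ("InErrors", "RcvbufErrors")}
```

	Taking one snapshot with read_udp_counters() before a dnsperf run and one after, then passing both to drop_delta(), should show whether the kernel is discarding datagrams during the test.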

	Perhaps the other folks have better details.
	Best regards!


Message: 4
Date: Wed, 31 May 2017 07:25:44 +0000
From: "Browne, Stuart" <Stuart.Browne at neustar.biz>
To: "bind-users at lists.isc.org" <bind-users at lists.isc.org>
Subject: Tuning suggestions for high-core-count Linux servers


I've been able to get my hands on some rather nice servers with 2 x 12-core Intel CPUs and was wondering if anybody had any decent tuning tips to get BIND to respond at a faster rate.

I'm seeing that adding cores beyond a single die gives pretty much no real improvement. I understand the NUMA boundaries etc., but this hasn't been my experience on previous iterations of Intel CPUs, at least not this dramatically. When I use more than a single die, CPU utilization continues to match the core count; however, throughput doesn't increase to match.

All the testing I've been doing for now (dnsperf from multiple sources) seems to plateau around 340k qps per BIND host.

Some notes:
- Primarily looking at UDP throughput here
- Intention is for high-throughput, authoritative only
- The zone files used for testing are fairly small and reside completely in-memory; no disk IO involved
- RHEL7, bind 9.10 series, iptables 'NOTRACK' firmly in place
- Current configure:

built by make with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--localstatedir=/var' '--with-libtool' '--enable-threads' '--enable-ipv6' '--with-pic' '--enable-shared' '--disable-static' '--disable-openssl-version-check' '--with-tuning=large' '--with-libxml2' '--with-libjson' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' 'LDFLAGS=-Wl,-z,relro ' 'CPPFLAGS= -DDIG_SIGCHASE -fPIC'

Things tried:
- Using 'taskset' to bind to a single CPU die and limiting BIND to '-n' CPUs doesn't improve much beyond letting BIND make its own decision
- NIC interfaces are set for TOE
- rmem & wmem changes (beyond a point) seem to do little to improve performance, mainly just make throughput more consistent
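
For reference, the taskset experiment in the first item can be sketched roughly as follows. This is an illustration under assumptions, not the exact invocation used: it assumes the Linux sysfs layout /sys/devices/system/node/nodeN/cpulist, the helper names are hypothetical, and the '-f' (foreground) argument to named is only a placeholder for whatever arguments the real service uses. Note that on hosts with hyperthreading the node's CPU list includes sibling threads, so the '-n' value derived here would count logical CPUs rather than physical cores.

```python
# Sketch only: derive a taskset command line that pins named to the
# CPUs of one NUMA node. Helper names are illustrative.


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-2,6,8-9' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Read the CPU list of one NUMA node (Linux only)."""
    with open(f"{sysfs_root}/node{node}/cpulist") as f:
        return parse_cpulist(f.read())


def taskset_command(cpus, named_args=("-f",)):
    """Build an argv list: pin to `cpus` and tell named to use that many CPUs."""
    cpu_arg = ",".join(str(c) for c in cpus)
    return ["taskset", "-c", cpu_arg,
            "named", "-n", str(len(cpus)), *named_args]
```

Calling taskset_command(node_cpus(0)) on the test host would give an argument list suitable for subprocess.run, pinning named to the first node and matching '-n' to the number of CPUs in it.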

I've yet to investigate the switch throughput or tweaking (don't yet have access to it).

So, any thoughts?


