DNS Capacity issue help -- Recursive Query -- it seems some packets are dropped by DNS
jp2111 at att.com
Tue Apr 10 00:37:08 UTC 2018
I ran a recursive-query capacity test. I used a traffic generator to send 15K QPS to DNS1 for FQDN1. (Note: DNS1 cannot resolve FQDN1 itself; it has to forward the query to DNS2, and the TTL is set to 0.)
During the test I saw many failures: the success rate was only about 85%. I then used tcpdump to capture traffic on DNS1 and found the following:
Thing 1. Between the traffic generator and DNS1, the number of DNS queries is larger than the number of responses. About 15% of the traffic is dropped by DNS1.
Thing 2. The number of recursive queries between DNS1 and DNS2 is far lower than the number of queries between the traffic generator and DNS1.
I want to confirm DNS behavior here:
DNS1 initiates a recursive query towards DNS2 when the first query arrives. The transaction time between DNS1 and DNS2 is about 3 milliseconds. If other queries for the same FQDN arrive during those 3 milliseconds, are they all queued in DNS1 because it has already sent a request for that FQDN to DNS2? If so, that would explain Thing 2. After DNS1 gets the response from DNS2, it should send a response to every queued request from the traffic generator, but DNS1 seems to drop some packets at this point: 15% of the queries get no response.
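To make the behavior I am asking about concrete, here is a toy model in Python. It is only a sketch of my hypothesis, not how BIND is actually implemented: the class name, the placeholder name "fqdn1.example" and the placeholder answer are made up for illustration, and the 3 ms delay is the DNS1-to-DNS2 transaction time measured above. At 15K QPS, roughly 15000 x 0.003 = 45 client queries arrive during one upstream round trip, so the model sends 45 concurrent queries for the same name.

```python
import asyncio


class CoalescingResolver:
    """Toy model: at most one upstream query per name while a lookup is in flight."""

    def __init__(self, upstream_delay=0.003):
        self.upstream_delay = upstream_delay  # assumed DNS1 -> DNS2 round trip (3 ms)
        self.in_flight = {}                   # name -> task shared by all waiting clients
        self.upstream_queries = 0             # how many queries went to "DNS2"

    async def _ask_upstream(self, name):
        self.upstream_queries += 1
        await asyncio.sleep(self.upstream_delay)  # stands in for the DNS2 round trip
        return "192.0.2.1"                        # placeholder answer (documentation address)

    async def resolve(self, name):
        task = self.in_flight.get(name)
        if task is None:
            # First client for this name: start the upstream lookup.
            task = asyncio.ensure_future(self._ask_upstream(name))
            self.in_flight[name] = task
            # TTL 0: nothing is cached, so forget the lookup once it completes.
            task.add_done_callback(lambda _: self.in_flight.pop(name, None))
        # Later clients for the same name simply wait on the same task.
        return await task


async def simulate(clients=45):
    resolver = CoalescingResolver()
    answers = await asyncio.gather(
        *(resolver.resolve("fqdn1.example") for _ in range(clients))
    )
    return resolver.upstream_queries, len(answers)


def run_simulation(clients=45):
    """Return (upstream queries sent, client answers delivered)."""
    return asyncio.run(simulate(clients))


if __name__ == "__main__":
    print(run_simulation())
```

In this model every waiting client gets an answer from a single upstream query, which matches Thing 2 but not Thing 1. So if DNS1 really works this way, the 15% loss must come from somewhere the model does not cover, for example a limit on how many clients may wait on one lookup, or packets lost before they reach the server process.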
Besides, CPU usage on DNS1 is not high, only 30%.
Is my understanding correct? Which DNS server parameters affect performance significantly? How can I troubleshoot this further?
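One check that might help narrow this down, since CPU usage is low, is whether the operating system drops UDP packets before the DNS server reads them. The sketch below assumes DNS1 runs on Linux (not stated above) and reads the UDP counters from /proc/net/snmp; the function names are my own. If RcvbufErrors grows during the test, the loss is happening in the socket receive buffer rather than inside the DNS server.

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp-style text."""
    # The file holds pairs of lines per protocol: a header line of counter
    # names followed by a line of values. 'UdpLite:' lines are skipped because
    # they do not start with 'Udp:'.
    rows = [line.split() for line in snmp_text.splitlines() if line.startswith("Udp:")]
    if len(rows) < 2:
        return {}
    header, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def read_udp_counters(path="/proc/net/snmp"):
    """Read the current UDP counters (Linux only; path is an assumption)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


def counter_delta(before, after, name="RcvbufErrors"):
    """Growth of one counter between two snapshots (0 if the counter is missing)."""
    return after.get(name, 0) - before.get(name, 0)
```

Usage would be to call read_udp_counters() on DNS1 before and after a test run and compare the two snapshots with counter_delta().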
Thank you very much!!