Fwd: Re: recursive resolver

Fred Morris m3047 at m3047.net
Thu Mar 12 17:00:31 UTC 2020


To confirm: this is a local caching (also known as recursive) resolver. It
is quick (< 100 msec) when answering from cache, but slow (> 1000 msec)
when it has to do the lookups itself.

On Thu, 12 Mar 2020, ShubhamGoyal wrote:
>
> We made a recursive resolver (CentOS 7, 8 GB RAM, 250 GB hard disk, and
> network speed is also good). It replies in 1200 msec and 1800 msec (which
> is very slow). When it replies from cache it takes 80 msec or 76 msec.
> So I want to know:
> How can I improve my recursive resolver's speed?

I can't give you a detailed troubleshooting guide, but I can give you a
general outline of the problem terrain.

The obvious conclusion (until disproved!) is that "DNS lookups to the rest 
of the world are slow" but I wouldn't start there. I'd start with looking 
at the BIND logs, because it's easy.

I'd start with setting up logging like the following:

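     logging {
         // named must be running in debug mode at level 2 or higher
         // (start it with -d 2, or use "rndc trace 2") for this to be
         // activated, otherwise it's just silent.
         channel queryerrors {
             file "bind-query-errors.log" versions 2 size 20m;
             severity debug 2;
             print-category no;
             print-severity yes;
             print-time yes;
         };
         // Route the query-errors category to the channel above.
         category query-errors { queryerrors; };
     };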

and then I'd look in bind-query-errors.log for entries like this:

27-Jan-2019 11:00:54.185 debug 2: fetch completed at resolver.c:4176 for
addons.cdn.mozilla.net/A in 10.000425: timed out/success
[domain:mozilla.net,referral:0,restart:4,qrysent:13,timeout:12,lame:0,
quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
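
If the log gets long, a few lines of script can pick out the slow fetches
for you. This is only a rough sketch in Python (standard library only); the
function names are mine, not anything BIND provides, and it assumes each
entry is a single line in the log file with the same layout as the sample
above.

```python
import re

# Matches a debug-level-2 "fetch completed" entry. Assumes the entry is one
# unwrapped line, laid out like the sample from bind-query-errors.log.
FETCH_RE = re.compile(
    r"fetch completed at \S+ for (?P<name>\S+)/(?P<rtype>\S+) "
    r"in (?P<seconds>[\d.]+): (?P<result>[^\[]+?) \[(?P<counters>[^\]]*)\]"
)


def parse_fetch_line(line):
    """Return a dict describing one 'fetch completed' line, or None."""
    m = FETCH_RE.search(line)
    if m is None:
        return None
    counters = {}
    for item in m.group("counters").split(","):
        key, _, value = item.partition(":")
        # 'domain' holds a name; every other field is an integer counter.
        counters[key] = value if key == "domain" else int(value)
    return {
        "name": m.group("name"),
        "rtype": m.group("rtype"),
        "seconds": float(m.group("seconds")),
        "result": m.group("result"),
        "counters": counters,
    }


def slow_fetches(lines, threshold=1.0):
    """Yield parsed entries that took longer than `threshold` seconds."""
    for line in lines:
        entry = parse_fetch_line(line)
        if entry is not None and entry["seconds"] > threshold:
            yield entry
```

Feeding it the open log file, e.g. slow_fetches(open("bind-query-errors.log")),
gives you the entries worth a closer look; the "timeout" and "qrysent"
counters are the ones I'd compare first.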

Don't panic about a few errors, but if you're having problems, that's 
where I'd look. ;-)

There are a number of different kinds of errors; this one is "timed out".
Do you see timeouts or query failures at your caching server's clients
(your workstation or laptop)? Can you confirm or disprove the "obvious
conclusion" from data in the logs? Is some other issue apparent?

Moving back to the "obvious conclusion", your workstation makes a request 
to your server with the "RD" (recursion desired) flag. Your server then 
makes requests of its own without the "RD" flag. You should be able to see 
these queries (and the responses) directed to nameservers on the internet 
by dumping packets, and to pair them up and see how long they're taking. 
You can even make your own with dig, using the +norecurse option to clear
the RD flag.
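
If you'd rather script it, the same kind of query can be built by hand. The
following is a minimal sketch in Python (standard library only), not a
replacement for dig or a packet capture: the function names are made up for
illustration, it only speaks UDP, and it doesn't check that the reply ID
matches the query.

```python
import socket
import struct
import time


def build_query(name, qtype=1, recursion_desired=False, query_id=0x1234):
    """Build a minimal DNS query packet (one question, class IN)."""
    # RD is the 0x0100 bit of the 16-bit flags word; leave it clear to
    # imitate what a recursive server sends to authoritative servers.
    flags = 0x0100 if recursion_desired else 0x0000
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".") if label
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def time_query(server, name, recursion_desired=False, timeout=5.0):
    """Send one UDP query; return (seconds, rcode) or (None, None) on timeout."""
    packet = build_query(name, recursion_desired=recursion_desired)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(packet, (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None, None
        elapsed = time.monotonic() - start
    # The low four bits of the flags word are the response code.
    rcode = struct.unpack("!H", reply[2:4])[0] & 0x000F
    return elapsed, rcode
```

For example, time_query("198.41.0.4", "mozilla.net") asks a.root-servers.net
directly. Run it against a few of the nameservers you see in your packet
dump and compare the times with what your own server takes.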

From here you have to explore whether it's a technical connectivity issue
(such as MTU or blocking of TCP etc.) or a provisioning / bandwidth issue
(just too slow, or too many hops, to anywhere or to somewhere in
particular).
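
One quick check on the connectivity side is to ask the same question over
UDP and over TCP and see whether one of them fails. Here's a hedged sketch
in Python (standard library only; all names are made up for illustration).
It sends a fixed non-recursive query for example.com and reports per
transport whether a reply came back and whether it was truncated. It won't
diagnose an MTU problem by itself, but "UDP works, TCP fails" is a useful
clue.

```python
import socket
import struct

# Fixed query: ID 0x4242, flags 0 (RD clear), one question: example.com A IN.
QUERY = (struct.pack("!HHHHHH", 0x4242, 0, 1, 0, 0, 0)
         + b"\x07example\x03com\x00" + struct.pack("!HH", 1, 1))


def is_truncated(reply):
    """True if the TC bit (0x0200 in the flags word) is set in the reply."""
    return bool(struct.unpack("!H", reply[2:4])[0] & 0x0200)


def recv_exact(sock, n):
    """Read exactly n bytes from a TCP socket or raise ConnectionError."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def query_udp(server, timeout=3.0):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(QUERY, (server, 53))
        return s.recvfrom(4096)[0]


def query_tcp(server, timeout=3.0):
    with socket.create_connection((server, 53), timeout=timeout) as s:
        # DNS over TCP prefixes each message with a two-byte length.
        s.sendall(struct.pack("!H", len(QUERY)) + QUERY)
        length = struct.unpack("!H", recv_exact(s, 2))[0]
        return recv_exact(s, length)


def probe(server):
    """Return {'udp': ..., 'tcp': ...} with 'ok', 'truncated' or 'failed: ...'."""
    results = {}
    for label, fn in (("udp", query_udp), ("tcp", query_tcp)):
        try:
            reply = fn(server)
            results[label] = "truncated" if is_truncated(reply) else "ok"
        except OSError as exc:  # includes timeouts and refused connections
            results[label] = "failed: %s" % exc
    return results
```

Call probe() with the address of a nameser­ver you saw in your packet dump.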

After you've ruled out the obvious conclusion you have to start
considering scenarios such as someone in the path intentionally
interfering with port 53 traffic.

--

Fred Morris



More information about the bind-users mailing list