Options for timeout in Bind/DNS

Thu Jul 2 15:37:24 UTC 2009

TPZ wrote:
> On 1-07-2009 at 19:02 Kevin Darcy wrote:
>   
>> TPZ wrote:
>>     
>>> Are there any options for Bind to configure timeout for DNS requests?
>>>
>> Short answer: not as far as I know.
>>
>> Hopefully you understand that it's the DNS clients, and not BIND itself,
>> that implement the main timeout/retry strategy for a DNS query
>> transaction. Send a request, if it times out, try another resolver in
>> the resolver list, or retry the request. The main determinant of whether
>> a request succeeds or fails, therefore, in the face of slow or
>> unavailable upstream nameservers, is the client resolver's
>> configuration, not BIND's. The applications which call the resolver
>> routines on those clients may also have their own timeout values, which
>> can sometimes be significantly shorter than what is set in the client
>> resolver. They will therefore time out the lookup request before the
>> client resolver would have abandoned it.
>>
>> BIND itself will time out and fail over quickly -- on the order of
>> milliseconds -- between the nameservers it talks to. It will also keep
>> track, in its cache, of what nameservers are responding slowly, or which
>> are giving invalid answers, and de-prioritize or avoid nameservers in
>> those respective categories. So, in practical terms, it is "self-tuning"
>> over time, and in a way that is far more sophisticated than setting a
>> single "timeout" value could ever hope to accomplish.
>>
>> - Kevin
>
> Thank you for the explanation.
> I experienced some problems with, e.g., SSH, because the option
> "#UseDNS yes" is set by default. When DNS is not available (i.e. the
> servers listed in /etc/resolv.conf), there is a timeout of about 30
> seconds. Even if you connect like this: "ssh user@192.168.100.12"
> (without a DNS domain name), the application tries to contact DNS.
> Do you agree with me?
>   
That has very little to do with the timeouts set in BIND itself. You 
need to have multiple servers in /etc/resolv.conf, and an intelligent 
failover strategy. Typically, we put 2 local servers in /etc/resolv.conf 
(to deal with a single server failure at that location, the most common 
case), followed by an off-site server as the third entry (to deal with 
the less-common situation of the server farm being isolated from 
wherever the client is, at that location). You have to consider the 
failure scenarios, their relative likelihood, and always bear in mind 
that the further down the resolver list you go, the more likely the app 
will have already failed before you get there. Having 10 entries in 
/etc/resolv.conf, for instance, while it may give the superficial 
appearance of bullet-proofing your nameserver against any failure 
scenario, may actually add little or no value over 3 or 4 entries. And 
it would probably be somewhat harder to keep up to date, if you 
re-address nameservers frequently or semi-frequently, as we do.
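
The point about long resolver lists can be made concrete with a rough
sketch. It assumes a deliberately simplified model in which the resolver
tries the servers strictly in order and waits a fixed timeout on each one
before moving on; real resolver libraries are more elaborate. The function
names and the 5-second default are illustrative assumptions, not part of
any library.

```python
def seconds_before_server_is_tried(position, timeout=5.0):
    """Seconds spent waiting on earlier, dead servers before the server at
    1-based `position` receives its first query (simplified model)."""
    if position < 1:
        raise ValueError("position is 1-based")
    return (position - 1) * timeout


def reachable_servers(server_count, app_deadline, timeout=5.0):
    """How many entries in the list are queried at all before the calling
    application gives up after `app_deadline` seconds."""
    return sum(
        1
        for position in range(1, server_count + 1)
        if seconds_before_server_is_tried(position, timeout) < app_deadline
    )


if __name__ == "__main__":
    # Ten entries, but an application that abandons the lookup after 12 s.
    print(reachable_servers(server_count=10, app_deadline=12.0))
```

Under these assumptions, an application with a 12-second deadline never
gets past the third entry, so entries four through ten add nothing.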

Another approach to consider, if you have such facilities available, or 
can afford to build new infrastructure, is to put your nameservers 
behind hardware load-balancers. Or use the "anycast" approach as the 
Internet root and TLD nameservers do (I don't have any personal 
experience with "anycast", and I know that approach has its 
detractors; I'm just throwing it out there as an idea to consider).

If your nameservers and/or network are unreliable, then you might 
consider fiddling with the timeout/retry parameters in /etc/resolv.conf, 
if the resolver library on your platform actually recognizes those 
options (check your man page for /etc/resolv.conf, or the equivalent 
documentation for your platform). But, again, this has very little to do 
with the configuration of BIND itself, and, ultimately, if your 
name-resolution architecture is unreliable, you're going to continue to 
have problems like this, and ssh is only going to be one of the victims, 
albeit the one which might be most visible to you as a system administrator.
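
As an illustration of those parameters, the sketch below embeds a sample
/etc/resolv.conf and reads back the server list and the timeout/retry
settings. The "options timeout:n attempts:n" syntax is the one documented
for the glibc resolver; other platforms may spell these options differently
or ignore them. The addresses are documentation placeholders and the parser
is a hypothetical helper written for this example, not a library routine.

```python
SAMPLE_RESOLV_CONF = """\
# two local servers first, then one off-site server
nameserver 192.0.2.10
nameserver 192.0.2.11
nameserver 198.51.100.53
options timeout:2 attempts:2
"""


def parse_resolv_conf(text):
    """Return (nameservers, options) from resolv.conf-style text.

    Handles only 'nameserver' and 'options' lines; lines starting with
    '#' or ';' are treated as comments. Everything else is ignored.
    """
    servers = []
    options = {}
    for line in text.splitlines():
        if not line.strip() or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                name, _, value = opt.partition(":")
                # Options such as "rotate" carry no value; record them as True.
                options[name] = int(value) if value.isdigit() else True
    return servers, options


if __name__ == "__main__":
    servers, options = parse_resolv_conf(SAMPLE_RESOLV_CONF)
    print(servers)
    print(options)
```

Shorter timeouts let the resolver move on from a dead server sooner, at the
cost of giving up earlier on a server that is merely slow.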

                                             - Kevin