lookup timeouts

Mark Andrews marka at isc.org
Mon Sep 19 21:51:38 UTC 2016


In message <CAMD-=VJSHvq+g_g0sWWJ9vdxsvWDKTeSD0shOqsNkA_qozHOnQ at mail.gmail.com>, Nick Edwards writes:
> 
> Hi,
> 
> We have a customer who runs their own caching server. In the afternoons,
> before they close up for the day, they run off-site backups, which take
> about 90 minutes. Anyone trying to use the internet during this time fails
> 99.9% of the time due to DNS lookup errors, but if they use an external
> DNS server, such as ours, it works; it is slow, but it does get a
> response. The local DNS cache server answers instantly for their
> private LAN, and pinging around their LAN is sub-1ms, so the problem occurs
> when bind tries to go out to get answers for real hostnames. When their
> internet link is not fully utilized there are no problems.
> 
> The problem arose again today, before the off-site backups, when just one PC
> got its message from Microsoft to grab the anniversary update at 11
> o'clock in the morning. Strangely, it did not fill their link, but the pps
> must have been rampant, because DNS lookups again failed when using their
> local cache resolver server.
> 
> Is there a named.conf setting we can suggest they use on their cache server
> that perseveres and waits a little longer for answers to send to their
> client machines?
> They are using bind 9.10.4-p2 with default settings from source package
> along with options of -
> 
>         directory "/opt/named";
>         allow-query { xxxxx; };
>         allow-query-cache { xxxxx; };
>         allow-transfer { xxxxxx; };
> 
> 
> Thanks for any advice.
> Nik

There is one word for this: bufferbloat.  This is where a router has
massive buffers for the link and, rather than dropping packets when it
cannot send them (thereby throttling TCP straight away), it queues up
traffic, creating a very long delay path, and only throttles TCP to
the link speed when the buffer finally fills.  I've seen this create
multi-second delays in the path.  Really bad bufferbloat can create
delays that are minutes long.
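
As a rough back-of-the-envelope illustration of how such delays arise,
the time a packet waits behind a full FIFO buffer is simply the buffer
size divided by the link rate.  The buffer sizes and link rates below
are made-up figures for illustration, not measurements from this
customer's network, and the function name is hypothetical.

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time for a completely full FIFO buffer to drain onto the link."""
    return buffer_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # Hypothetical (buffer size in bytes, uplink rate in bit/s) pairs.
    examples = [
        (256 * 1024, 1_000_000),
        (1024 * 1024, 1_000_000),
        (1024 * 1024, 512_000),
    ]
    for buffer_bytes, rate in examples:
        delay = queue_delay_seconds(buffer_bytes, rate)
        print(f"{buffer_bytes:>8} bytes at {rate:>9} bit/s -> {delay:.2f} s of queueing")
```

With figures like these, a DNS query sitting behind the backup traffic
can wait several seconds in each direction before it even leaves the
site.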

Go talk to your router vendor.  This is either a bug in their product
or a bug in an upstream router.  It is possible to examine the traffic
flows in a router and mitigate bufferbloat in another router by
restricting the traffic through the first router to slightly less
than what the second router will allow.
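
The effect of that restriction can be sketched with a very simple
fluid model.  This is only an illustration under stated assumptions:
the sender offers a constant load and does not back off (real TCP
eventually does), and all rates, the buffer size, and the function
name are hypothetical.

```python
from typing import Optional


def bottleneck_queue_delay(
    offered_bps: float,
    bottleneck_bps: float,
    buffer_bytes: int,
    seconds: float,
    shaper_bps: Optional[float] = None,
) -> float:
    """Estimate queueing delay (in seconds) at the bottleneck router.

    If shaper_bps is given, the first router limits traffic to that
    rate before it reaches the bottleneck.
    """
    arrival_bps = offered_bps if shaper_bps is None else min(offered_bps, shaper_bps)
    # Bits arriving faster than the bottleneck can send pile up in its buffer.
    excess_bits = max(0.0, arrival_bps - bottleneck_bps) * seconds
    # The buffer cannot hold more than its capacity; the rest is dropped.
    queued_bits = min(excess_bits, buffer_bytes * 8)
    return queued_bits / bottleneck_bps


if __name__ == "__main__":
    bottleneck = 1_000_000       # hypothetical 1 Mbit/s uplink
    buffer_bytes = 1024 * 1024   # hypothetical 1 MiB buffer
    unshaped = bottleneck_queue_delay(10_000_000, bottleneck, buffer_bytes, 60)
    shaped = bottleneck_queue_delay(
        10_000_000, bottleneck, buffer_bytes, 60, shaper_bps=0.95 * bottleneck
    )
    print(f"unshaped: {unshaped:.2f} s, shaped to 95%: {shaped:.2f} s")
```

In this model the large buffer never fills once the first router
sends slightly less than the bottleneck can carry.  The queue moves to
the first router instead, where it is under your control and can be
kept short.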

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka at isc.org


More information about the bind-users mailing list