<div dir="ltr"><div>Thanks Mark, that's the likely reason. They are using a microtek or similar junk, if my memory serves me correctly. We will drop in a Juniper and see if that resolves it.<br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Sep 20, 2016 at 7:51 AM, Mark Andrews <span dir="ltr">&lt;<a href="mailto:marka@isc.org" target="_blank">marka@isc.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>

In message &lt;CAMD-=<a href="mailto:VJSHvq%2Bg_g0sWWJ9vdxsvWDKTeSD0shOqsNkA_qozHOnQ@mail.gmail.com">VJSHvq+g_<wbr>g0sWWJ9vdxsvWDKTeSD0shOqsNkA_<wbr>qozHOnQ@mail.gmail.com</a>&gt;, Nick Edwards writes:<br>

><br>

> Hi,<br>

><br>

> We have a customer who has their own cache server, but in the afternoons<br>

> before they close up for the day, they commit off-site backups, this<br>

> process takes them about 90 mins, anyone trying to use the internet in this<br>

> time fails 99.9% of the time due to DNS lookup errors, but if they use an<br>

> external DNS server, such as ours, it works - albeit slowly, but it does get a<br>

> response. The local DNS cache server operates fine and instant for their<br>

> private LAN, and pinging around their LAN is sub 1ms so the problem exists<br>

> when bind tries to go out to get answers for real hostnames. When their<br>

> internet link is not fully utilized there are no problems.<br>

><br>

> The problem arose again today before the off-site backups when just one PC<br>

> got its message from Microsoft to grab the anniversary update, at 11<br>

> o'clock in the morning, strangely it did not fill their link, but the pps<br>

> must have been rampant because DNS lookups again failed when using their<br>

> local cache resolver server.<br>

><br>

> Is there a named.conf setting we can suggest they use on their cache server<br>

> that perseveres and waits a little longer for answers to send to their<br>

> client machines?<br>

> They are using bind 9.10.4-p2 with default settings from source package<br>

> along with options of -<br>

><br>

>         directory "/opt/named";<br>

>         allow-query { xxxxx; };<br>

>         allow-query-cache { xxxxx; };<br>

>         allow-transfer { xxxxxx; };<br>

><br>

><br>

> Thanks for any advice.<br>

> Nik<br>

<br>

</div></div>There is one word for this.  Bufferbloat.  This is where a<br>

router has massive buffers for the link and, rather than dropping<br>

packets when it cannot send them, thereby throttling TCP straight<br>

away, it queues up traffic, creating a very long delay path, and<br>

eventually throttles TCP to the link speed when the buffer finally<br>

fills.  I've seen this create multi-second delays in the path.<br>

Really bad bufferbloat can create delays that are minutes long.<br>

<br>
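As a rough sketch of the arithmetic behind those delays: a packet that<br>
arrives behind a full FIFO buffer waits for the whole buffer to drain<br>
onto the link first.  The function name, buffer size and link rate<br>
below are made-up illustrations, not measurements from this site.<br>

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time for a full FIFO buffer to drain onto the link (worst-case queueing delay)."""
    return buffer_bytes * 8 / link_bits_per_second


# Hypothetical numbers: 1 MiB of buffering in front of a 1 Mbit/s uplink.
delay = queue_delay_seconds(1024 * 1024, 1_000_000)
print(f"A DNS query stuck behind the full buffer waits about {delay:.1f} s")
```

With those assumed numbers the wait is about 8.4 seconds, which is<br>
consistent with multi-second delays in the path.<br>
<br>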

Go talk to your router vendor. This is either a bug in their product<br>

or a bug in an upstream router.  It is possible to examine the traffic<br>

flows in a router and mitigate bufferbloat in another router by<br>

restricting the traffic through the first router to slightly less<br>

than what the second router will allow.<br>
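<br>
A minimal sketch of picking that "slightly less" rate, assuming the<br>
second router's rate is known.  The 5% headroom and the function name<br>
are illustrative assumptions, not a recommendation from any vendor.<br>

```python
def shaped_rate_bps(bottleneck_bps: float, headroom: float = 0.05) -> float:
    """Rate to shape to on the first router so the second router's buffer stays near empty.

    headroom is the fraction of the bottleneck rate left unused (assumed 5% here).
    """
    if not (headroom > 0 and 1 > headroom):
        raise ValueError("headroom must be a fraction between 0 and 1")
    return bottleneck_bps * (1 - headroom)


# Hypothetical example: the second router allows 10 Mbit/s.
rate = shaped_rate_bps(10_000_000)
print(f"Shape the first router to about {rate / 1e6:.2f} Mbit/s")
```

Because the first router then never offers more than the second can<br>
send, the queue builds in the router you control rather than in the<br>
oversized buffer.<br>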

<span class="HOEnZb"><font color="#888888"><br>

Mark<br>

--<br>

Mark Andrews, ISC<br>

1 Seymour St., Dundas Valley, NSW 2117, Australia<br>

PHONE: <a href="tel:%2B61%202%209871%204742" value="+61298714742">+61 2 9871 4742</a>                 INTERNET: <a href="mailto:marka@isc.org">marka@isc.org</a><br>

</font></span></blockquote></div><br></div>