Increase in retry and timeout errors post 9.9.4 -> 9.11.4 upgrade

Gareth Parks gparks at tripadvisor.com
Mon May 4 04:14:24 UTC 2020


I set send-cookie no; globally to test this theory, but the pattern of retries and timeouts continued. However, I was able to determine that the retries/timeouts follow the same pattern as the resolver statistic for truncated responses received, which suggests they are related.


When I look at the same graph on one of the other servers, it shows no truncated responses but instead a lot of NXDOMAIN errors, which the upgraded server does not have.
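
One way to put a number on "matches the same pattern" is to turn the cumulative counters into per-interval increases and correlate the two series. The following is only a minimal sketch under stated assumptions: the sample values are invented for illustration, the function names are hypothetical, and it assumes the counters were already collected from the statistics channel at a fixed interval.

```python
from math import sqrt


def deltas(samples):
    """Convert cumulative counter readings into per-interval increases.

    A reading lower than the previous one is treated as a counter reset
    (for example after a named restart), so the new value is used as-is.
    """
    out = []
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - prev if cur >= prev else cur)
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no pattern to correlate
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented cumulative readings, one per 2-minute poll (not real data).
    retry_samples = [100, 104, 160, 165, 230, 236]
    truncated_samples = [40, 41, 70, 72, 110, 112]
    r = pearson(deltas(retry_samples), deltas(truncated_samples))
    print(f"correlation between retry and truncated deltas: {r:.2f}")
```

A coefficient close to 1 would support the idea that the two counters move together; it does not by itself show which one causes the other.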


Gareth

________________________________
From: Mark Andrews <marka at isc.org>
Sent: Monday, 4 May 2020 12:13 PM
To: Gareth Parks
Cc: bind-users at lists.isc.org
Subject: Re: Increase in retry and timeout errors post 9.9.4 -> 9.11.4 upgrade

Well, BIND 9.11+ supports DNS COOKIE by default, and there are some servers that mishandle EDNS requests with a DNS COOKIE option present.  Unknown EDNS options are supposed to be ignored, but there are servers/firewalls that just drop such queries.  Others return FORMERR, others return NXDOMAIN even though they return an answer when the option is not present, others echo unknown options, and others still send back a DNS COOKIE response but fail to correctly copy the client cookie part to the response.
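
To make the wire format concrete, here is a minimal sketch of a query carrying a COOKIE option in the EDNS OPT record, plus a check for the last failure mode above (the client cookie not being copied back). It assumes the layout from RFC 6891 and RFC 7873 (OPT type 41, COOKIE option code 10, 8-byte client cookie); the function names are hypothetical, this is not how BIND itself builds queries, and sending the packet and locating the OPT record in a response are left out.

```python
import os
import struct

OPT_RR_TYPE = 41         # EDNS OPT pseudo-record type (RFC 6891)
COOKIE_OPTION_CODE = 10  # DNS COOKIE option code (RFC 7873)


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, client_cookie, qtype=1, query_id=0x1234):
    """Build a DNS query for `name` with an OPT record holding a client cookie."""
    if len(client_cookie) != 8:
        raise ValueError("client cookie must be exactly 8 bytes")
    # Header: ID, flags (RD set), 1 question, 0 answers, 0 authority, 1 additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    option = struct.pack("!HH", COOKIE_OPTION_CODE, len(client_cookie)) + client_cookie
    # OPT RR: root name, type 41, CLASS = UDP payload size, TTL = 0, RDLEN, RDATA.
    opt_rr = b"\x00" + struct.pack("!HHIH", OPT_RR_TYPE, 4096, 0, len(option)) + option
    return header + question + opt_rr


def find_cookie(opt_rdata):
    """Return the COOKIE option payload from OPT RDATA, or None if absent."""
    pos = 0
    while pos + 4 <= len(opt_rdata):
        code, length = struct.unpack_from("!HH", opt_rdata, pos)
        pos += 4
        if code == COOKIE_OPTION_CODE:
            return opt_rdata[pos:pos + length]
        pos += length
    return None


def client_cookie_echoed(opt_rdata, client_cookie):
    """True if the response's COOKIE option starts with our client cookie."""
    cookie = find_cookie(opt_rdata)
    return cookie is not None and cookie[:8] == client_cookie


if __name__ == "__main__":
    cookie = os.urandom(8)
    packet = build_query("example.com", cookie)
    print(len(packet), "byte query, client cookie", cookie.hex())
```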

https://ednscomp.isc.org/compliance/ts/govfull.optfail.html shows how servers for the .GOV zone behave when presented with an unknown EDNS option.  Other datasets are similar.

You can use "server <prefix> { send-cookie no; };" to work around known broken servers.
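
If the list of broken servers grows, the statements can be generated rather than typed by hand. This is only a sketch: the helper name is hypothetical, the prefixes are documentation addresses rather than real broken servers, and the output is meant to be reviewed and pasted into named.conf manually.

```python
import ipaddress


def render_server_blocks(prefixes):
    """Render one 'server <prefix> { send-cookie no; };' line per prefix.

    Each entry is validated with the ipaddress module; a bare address
    becomes a /32 (IPv4) or /128 (IPv6) prefix, and a prefix with host
    bits set raises ValueError.
    """
    lines = []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix, strict=True)
        lines.append(f"server {network} {{ send-cookie no; }};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder prefixes from the documentation ranges, not real servers.
    print(render_server_blocks(["192.0.2.53", "198.51.100.0/24", "2001:db8::/32"]))
```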

Mark

> On 4 May 2020, at 11:21, Gareth Parks <gparks at tripadvisor.com> wrote:
>
> Hi,
>
> I have three CentOS 7 servers running BIND as internal resolvers. An update was released that upgrades them from 0:9.9.4-74.el7_6.2 to 32:9.11.4-16.P2.el7_8.2. After upgrading one of the servers, there has been a notable increase in retry and timeout errors, as measured by data collected from the statistics channel. Where previously the number of retry and timeout errors was < 10 per 2 minutes, I now regularly see spikes > 50 per 2 minutes, while the error levels have remained consistent on the other two servers. When I downgrade the server back to 9.9.4, the error rate drops again.
>
> I increased the log level for the query-errors log and observed that the number of entries on the upgraded and non-upgraded servers was about the same, so there doesn't appear to be an increase in errors.
>
> I'm not sure whether I'm simply not looking in the correct place to identify the source of the retries/timeouts. The other possibility that occurred to me is that what those retry/timeout counters represent changed between the two versions, so the increased rate is not a problem but just reflects different information.
>
> Gareth

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742              INTERNET: marka at isc.org
