Increase in retry and timeout errors post 9.9.4 -> 9.11.4 upgrade
gparks at tripadvisor.com
Mon May 4 01:21:43 UTC 2020
I have three CentOS 7 servers running BIND as internal resolvers. A package update upgrades them from 0:9.9.4-74.el7_6.2 to 32:9.11.4-16.P2.el7_8.2. After applying this upgrade to one of the servers, I see a notable increase in retry and timeout errors, as measured by data collected from the statistics channel. Previously the retry and timeout counts were below 10 per 2-minute interval; on the upgraded server I now regularly see spikes above 50 per 2-minute interval, while the error levels on the other two servers have stayed the same. When I downgrade the server back to 9.9.4, the error rate drops again.
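For context, the statistics channel reports cumulative counters, so a per-2-minute figure like the ones above would typically be the difference between two successive readings. Below is a minimal sketch of that calculation, not the actual collection script. It assumes the resolver counters are named "Retry" and "QueryTimeout" and that each sample has already been reduced to a plain dict; the sample values are made up for illustration.

```python
from typing import Dict, List

# Assumed counter names; check them against your own statistics output.
COUNTERS = ("Retry", "QueryTimeout")


def interval_deltas(samples: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Turn successive cumulative samples into per-interval increases."""
    deltas = []
    for prev, curr in zip(samples, samples[1:]):
        row = {}
        for name in COUNTERS:
            before = prev.get(name, 0)
            after = curr.get(name, 0)
            # A smaller value than last time means named was restarted and
            # the counter started over; count from zero instead of going negative.
            row[name] = after - before if after >= before else after
        deltas.append(row)
    return deltas


def spike_intervals(deltas: List[Dict[str, int]], threshold: int = 50) -> List[int]:
    """Return the indexes of intervals where any counter rose by more than threshold."""
    return [i for i, row in enumerate(deltas)
            if any(value > threshold for value in row.values())]


# Illustrative samples taken 2 minutes apart (hypothetical numbers).
samples = [
    {"Retry": 100, "QueryTimeout": 40},
    {"Retry": 104, "QueryTimeout": 43},
    {"Retry": 170, "QueryTimeout": 60},
    {"Retry": 5, "QueryTimeout": 1},   # counters reset after a restart
]
deltas = interval_deltas(samples)
spikes = spike_intervals(deltas)
```

Handling the reset case matters here because the upgrade and downgrade each restart named, which would otherwise show up as a large negative or bogus delta in the first interval afterwards.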
I increased the log level for the query-errors log and observed that the number of entries on the upgraded and non-upgraded servers was about the same, so there doesn't appear to be an increase in logged errors.
I'm not sure whether I'm simply not looking in the right place to identify the source of the retries and timeouts. The other possibility that occurred to me is that what those retry/timeout counters represent changed between the two versions, in which case the increased rate is not a problem but just reflects different information.