Sporadic Timeouts after upgrading to bind9.20

Klaus Darilion klaus.darilion at nic.at
Thu Dec 5 23:28:11 UTC 2024


Hi Ondřej!

I can also test the development branch. I prefer deb packages (do you have nightly builds?), but I can fall back to make && make install.

Regards
Klaus


From: Ondřej Surý <ondrej at isc.org>
Sent: Thursday, December 5, 2024 8:36 PM
To: Klaus Darilion <klaus.darilion at nic.at>
Cc: Klaus Darilion via bind-users <bind-users at lists.isc.org>
Subject: Re: Sporadic Timeouts after upgrading to bind9.20

Hi Klaus,

we've identified an issue in the glue cache that has been causing drops in performance.

Can you test a development branch, or do you need a fix on top of 9.20?

Ondrej
--
Ondřej Surý (He/Him)
ondrej at isc.org

My working hours and your working hours may be different. Please do not feel obligated to reply outside your normal working hours.


On 9. 9. 2024, at 10:39, Klaus Darilion via bind-users <bind-users at lists.isc.org> wrote:

As we still see several timeouts, I downgraded our server to 9.18. If you know of another workaround or need someone to test a new version, please let me know.

Thanks
Klaus

From: Klaus Darilion <klaus.darilion at nic.at>
Sent: Saturday, September 7, 2024 12:36 AM
To: Klaus Darilion <klaus.darilion at nic.at>; Ondřej Surý <ondrej at isc.org>
Cc: Klaus Darilion via bind-users <bind-users at lists.isc.org>
Subject: RE: Sporadic Timeouts after upgrading to bind9.20

Correcting myself: even with { reuseport no; }; and UV_THREADPOOL_SIZE=12, timeouts still happen, but the situation has improved a lot.
Regards
Klaus

From: bind-users <bind-users-bounces at lists.isc.org> On Behalf Of Klaus Darilion via bind-users
Sent: Saturday, September 7, 2024 12:21 AM
To: Ondřej Surý <ondrej at isc.org>
Cc: Klaus Darilion via bind-users <bind-users at lists.isc.org>
Subject: RE: Sporadic Timeouts after upgrading to bind9.20


From: Ondřej Surý <ondrej at isc.org>
Sent: Friday, September 6, 2024 4:08 PM
To: Klaus Darilion <klaus.darilion at nic.at>
Cc: Petr Špaček <pspacek at isc.org>; bind-users at lists.isc.org; Klaus Darilion via bind-users <bind-users at lists.isc.org>
Subject: Re: Sporadic Timeouts after upgrading to bind9.20

Are you running with options { reuseport no; }; ?

You might want to try that.

After setting reuseport no; (and UV_THREADPOOL_SIZE=12) I have not seen any timeouts anymore.
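
For readers following along, the workaround discussed here could be written roughly as the named.conf fragment below. This is only a sketch: the thread gives the option name and the UV_THREADPOOL_SIZE value but not the full configuration, so the placement within an existing options block is an assumption. UV_THREADPOOL_SIZE is a libuv environment variable, not a named.conf option, so it has to be set in the environment named starts with; how that is done (for example an Environment= line in a systemd unit override) depends on the installation and is not stated in the thread.

```conf
// Sketch only: merge into the existing options block rather than
// adding a second one.
options {
    // Turn off kernel load-balancing of incoming sockets
    // (SO_REUSEPORT on Linux); the default is "yes".
    reuseport no;
};

// Not part of named.conf: UV_THREADPOOL_SIZE=12 must be exported in
// the environment of the named process before it starts.
```

As noted elsewhere in the thread, this reduced the timeouts considerably but did not eliminate them.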

Anyway, this:

TID 8917:
#0  0x00007b385aa6daa9     cds_lfht_destroy - /usr/lib/x86_64-linux-gnu/liburcu-cds.so.8.1.0

caught my eye. Are the zones you are hosting particularly large on GLUE?

I don't know; I have not checked yet. One of the affected zones is .ch. You can download the zone from https://zonedata.switch.ch/. They are using NSEC (not NSEC3, as I wrote before).



Also, if you have more eu-stack output, can you confirm this is the pattern now?

After setting reuseport no; I no longer have any stack traces. But if it would help you, I can undo the workaround next week to collect traces.

Thanks
Klaus


--
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.

