Deprecating BIND 9.18+ on Windows (or making it community improved and supported)
ondrej at isc.org
Thu May 13 17:14:29 UTC 2021
I didn’t write the email to put the blame anywhere or point fingers. I am just describing the situation.
Ondřej Surý — ISC (He/Him)
My working hours and your working hours may be different. Please do not feel obligated to reply outside your normal working hours.
> On 13. 5. 2021, at 17:29, Danny Mayer <mayer at pdmconsulting.net> wrote:
>> On 5/13/21 9:45 AM, Ondřej Surý wrote:
>> Just a follow-up with a recent real-life example.
>> I’ve spent a few days hunting a problem on Windows that got introduced by a fix to the outgoing UDP selection code. While having bugs is normal (and this was really a one-liner), it’s abnormal to not have tools for debugging the problem. Here’s the (incomplete) list of things that would have to be fixed:
>> 1. Automatic crashdump collection in our CI - it should work, but it simply doesn’t and it also ignores the crashdump collection on the Hyper-V Windows Server 2016 I am using for building and debugging Windows binaries
> If you build the binaries in debug mode does it give you the crashdump collection? Hard to know without looking at the sources.
>> 2. Automatic crashdump processing - we need full backtrace printed for all the threads, both in the CI and as a “cookbook” for developers.
> What happens on Unix? If this is not out-of-the-box then you have to use the Microsoft tools to do that.
>> 3. The build system rewrite - currently, the build system is this horrible hybrid of Perl that generates MSVC solution files (ninja-build or cmake would be sane alternatives)
> When the build system was written, around 2001, there were not a lot of alternatives. I was used to writing TCL but not a lot of people knew that language. Perl was the popular choice. Today that would be worth revisiting.
>> 4. Improvements like this: https://gitlab.isc.org/isc-projects/bind9/-/merge_requests/5020 <-- the new networking stack uses libuv, where we set up listening on each netmgr thread; on Windows, we currently limit this to a **single thread**. This branch is an attempt to use the WS2 API to make Windows work the same as the rest of the platforms, but it fails horribly. It’s beyond our capacity to pursue this any further.
> Why is this single-threaded? The Windows code handling the incoming and outgoing requests was always multithreaded. There was handling within the Windows code to properly deal with the threads and locking that was necessary.
>> Currently, working on Windows feels like landing on an alien planet with failing life support and finding these strange large eggs in the cavern while having Sigourney Weaver on the team.
> Well, I had warned that there needed to be someone on the team to properly deal with the Windows side.