Massive increase of SERVFAIL after April 28th 2025.

vincent at cojot.name vincent at cojot.name
Thu May 1 16:07:21 UTC 2025


Hi Carlos,

First of all, I'd like to say how sorry I was for those affected, as I was 
watching the events unfold down south.

I've rebuilt dnstracer for RHEL9 and I don't really understand what's 
going on here. Here's the output for ftp.lip6.fr:

# dnstracer -q cname -s M.GTLD-SERVERS.NET  ftp.lip6.fr
Tracing to ftp.lip6.fr[cname] via M.GTLD-SERVERS.NET, maximum of 3 retries
M.GTLD-SERVERS.NET (2001:0501:b1f9:0000:0000:0000:0000:0030) Refers backwards

Same output from any of my bind hosts:

# dnstracer -q cname -s 127.0.01  ftp.lip6.fr
Tracing to ftp.lip6.fr[cname] via 127.0.01, maximum of 3 retries
127.0.01 (127.0.0.1) Refers backwards

But interestingly, doing this with www.google.com instead of ftp.lip6.fr 
-only- works on the bind servers with forwarders configured. On a test 
bind host without the forwarders, I get this:

# dnstracer -q cname -s 127.0.01  www.google.com
Tracing to www.google.com[cname] via 127.0.01, maximum of 3 retries
127.0.01 (127.0.0.1) Refers backwards
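
A possible cross-check that does not depend on dnstracer, as a minimal 
sketch only: ask each nameserver listed for a zone directly, without 
recursion, and print the response code it returns. The function names 
rcode_of and check_zone are made up for illustration. It assumes dig is 
installed and that the local resolver can still return the NS set. A server 
that shows SERVFAIL, REFUSED or no-reply would be a candidate for a lame or 
unreachable delegation.

```shell
# Read dig output on stdin and print the RCODE from its header line
# (e.g. NOERROR, SERVFAIL), or "no-reply" if no header was seen.
rcode_of() {
    rc=$(sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1)
    printf '%s\n' "${rc:-no-reply}"
}

# For every NS record of the zone, send a non-recursive SOA query
# straight to that server and report what it answered.
check_zone() {
    zone=$1
    for ns in $(dig +short NS "$zone"); do
        rc=$(dig +norecurse +time=2 +tries=1 "@$ns" "$zone" SOA 2>/dev/null | rcode_of)
        printf '%-30s %s\n' "$ns" "$rc"
    done
}
```

Usage would be something like "check_zone lip6.fr". The short timeout and 
single try are arbitrary choices to keep the loop quick when servers are down.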

Vincent

On Thu, 1 May 2025, Carlos Horowicz via bind-users wrote:

> 
> Hi,
> 
> For SERVFAIL to happen, ALL authoritative servers for the affected domains must have been in Datacenters in Spain, Portugal or southern France.
> 
> I live in Spain, and at 12:33 CET I lost not only power but basic telephony, cellular telephony and cellular data. Everything. Power generators were only good for keeping power
> locally at Datacenters or Hospitals, but they were isolated from each other.
> 
> The mitigation began at around 2-3pm CET, as they were turning up different power plants one at a time and connecting them to the power network, and it took them more than 12
> hours to turn everything up.
> 
> So maybe that was the reason, if it coincides with your perception ... dnstracer has eventually helped me find lame delegations.
> 
> Carlos Horowicz
> Planisys
> 
> On 01/05/2025 17:23, Rob McEwen via bind-users wrote:
>       From vincent at cojot.name
>       until a few days ago (April 28th?) when the amount of SERVFAIL started going ballistic and started preventing the resolution of a lot of DNS names on the
>       internet to the point where DNS was unusable
> 
> 
> I strongly suspect that this was caused (even if indirectly?) by the MASSIVE and many-hours-long power outages in Europe, mainly in Spain and Portugal. That started on
> April 28, 2025, at approximately 6:33 a.m. Eastern Time (ET) - and the majority of it lasted almost 24 hours.
> 
> https://www.france24.com/en/europe/20250430-what-we-know-so-far-about-the-massive-blackout-that-hit-spain-and-portugal
> 
> Hopefully, you're not seeing any more of these errors now?
> 
> Rob McEwen, invaluement
> 
> 
> 
>

