Problem with BIND 9.10.1-P1 recursion limits

Mike Hoskins (michoski) michoski at
Tue Dec 9 20:04:00 UTC 2014

Thanks for digging in so fast.  Our mitigation will be sticking to
9.9.6-P1, since we like ESV anyway.

Wanted to point out that (perhaps sadly) this isn't so crazypants...or at
least not uncommon.  The *edge* and *aka* references indicate Akamai
DNS+CDN.  From my last overview, this has gotten cleaner in the latest
versions of their offerings, but many of the largest sites on the
Internet are configured this way today.

-----Original Message-----
From: Evan Hunt <each at>
Date: Tuesday, December 9, 2014 at 2:41 PM
To: Stuart Henderson <stu at>
Cc: Tony Finch <dot at>, "bind-users at"
<bind-users at>
Subject: Re: Problem with BIND 9.10.1-P1 recursion limits

>On Tue, Dec 09, 2014 at 05:51:58PM +0000, Evan Hunt wrote:
>> That's unexpected. I'll see if I can reproduce it.
>Okay, I can.
>Part of the problem is the somewhat crazypants DNS configuration
>  $ dig +noall +answer
>            3600    IN      CNAME
>  60      IN      CNAME
> 21600 IN CNAME
> 900 IN CNAME
> 20      IN      A
>... like, *wow*.  A chain of five aliases with TTLs ranging from 20
>seconds to 6 hours, passing through five different zones (,
>,,,, hosted by
>servers in three *more* zones (,, and,
>in addition to and  I had to almost
>double the maximum recursion queries to 99 to get this to work on
>an empty cache.  Yikes.
>Almost any non-empty cache will dodge the bullet. Preceding the
>lookup of with "dig @::1 ns com" causes the query to
>succeed.  Also, as previously noted, on 9.9 it will succeed without
>a five-minute delay if you just issue the query a second time.
>So, possible workarounds if this issue is causing problems for you:
>  - Ensure that the first query sent to a newly-primed recursive
>    resolver isn't quite as spectacular as this one;
>  - Add "max-recursion-queries 100;" to your options statement;
>  - Run 9.9.6-P1 instead of 9.10.1-P1.
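
For the second workaround, a minimal named.conf fragment might look like the sketch below. Only the max-recursion-queries line comes from the message; the surrounding options block is a placeholder, and any options already present in a real configuration would stay alongside it.

```
options {
        // Raise the per-lookup query limit, as suggested in the list above.
        max-recursion-queries 100;
};
```

After changing the file, the server has to reload its configuration before the new limit applies.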
>The five-minute delay is still a bit of a puzzle. It happens because
>of this code in adb.c:
>        /* XXXMLG Don't pound on bad servers. */
>        if (address_type == DNS_ADBFIND_INET) {
>                name->expire_v4 = ISC_MIN(name->expire_v4, now + 300);
>                name->fetch_err = FIND_ERR_FAILURE;
>                inc_stats(adb, dns_resstatscounter_gluefetchv4fail);
>        } else {
>                name->expire_v6 = ISC_MIN(name->expire_v6, now + 300);
>                name->fetch6_err = FIND_ERR_FAILURE;
>                inc_stats(adb, dns_resstatscounter_gluefetchv6fail);
>        }
>The "now + 300" bit is where the five minutes comes from.  That's code
>that's been around for years, and it is in 9.9, but apparently it's
>reached more easily in 9.10.  I'm looking into the reasons for this.
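
To make the effect of the "now + 300" clamp concrete, here is a small standalone sketch in C. It is not the adb.c code: the struct, the function names, and the SKETCH_ macros are hypothetical stand-ins, and the retry rule (a failed name is not fetched again until its expiry time passes) is a simplifying assumption about how the expiry is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the real BIND definitions. */
#define SKETCH_MIN(a, b) (((a) < (b)) ? (a) : (b))
#define SKETCH_FAILURE_HOLD 300u /* seconds, the "now + 300" in the quoted code */

typedef struct {
        uint32_t expire_v4;    /* absolute time (seconds) when the entry expires */
        int      fetch_failed; /* nonzero once an address fetch has failed */
} sketch_name_t;

/*
 * Record a failed fetch. The expiry is only ever lowered: an entry that
 * would have expired sooner than now + 300 keeps its earlier expiry.
 */
void
sketch_fetch_failed(sketch_name_t *name, uint32_t now) {
        assert(name != NULL);
        name->expire_v4 = SKETCH_MIN(name->expire_v4, now + SKETCH_FAILURE_HOLD);
        name->fetch_failed = 1;
}

/*
 * Assumed retry rule for this sketch: a name with a recorded failure is
 * left alone until its expiry time has been reached.
 */
int
sketch_may_retry(const sketch_name_t *name, uint32_t now) {
        assert(name != NULL);
        return (!name->fetch_failed || now >= name->expire_v4);
}
```

Under these assumptions, a failure recorded at time 1000 on an entry with a distant expiry blocks retries until time 1300, which matches the five-minute delay described above. An entry already due to expire at 1100 would still be retried at 1100, since the clamp never extends an expiry.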
>The problem should be addressed in 9.10.2, which is likely to be
>released next month.
>Evan Hunt -- each at
>Internet Systems Consortium, Inc.
