Tony Finch dot at dotat.at
Thu Sep 17 22:26:03 UTC 2020

Scott Nicholas <scott.nicholas at scottn.us> wrote:
> Primary nameserver is behind a cache/proxy on enterprise network such that
> all external traffic hits this. Zone went bogus. I blame policy but on
> further inspection 2/3 proxys had differing TTL between the DNSKEY and it's

Hmm, that's suspicious. In the DNS, an RRset is an atomic unit and every
record must have the same TTL. In DNSSEC the RRSIG is part of the RRset,
so if there is a difference between the DNSKEY TTL and the RRSIG(DNSKEY)
TTL there is a bug, and it might be bad enough to cause validation
failures.
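
If you want to check for this kind of mismatch mechanically, here is a
rough Python sketch. It assumes you feed it the answer section of plain
dig output (one record per line, not +multiline), and the sample records,
TTLs, and key tag in it are made up for illustration. The function names
are mine, not part of any tool.

```python
from collections import defaultdict

def rrset_ttls(dig_answer):
    """Map (owner, type) to the set of TTLs seen in dig-style output.

    Each RRSIG is filed under the type it covers, so the RRSIG(DNSKEY)
    lands in the same bucket as the DNSKEY records it signs.
    """
    ttls = defaultdict(set)
    for line in dig_answer.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig's comment lines
        fields = line.split()
        if len(fields) < 5 or not fields[1].isdigit():
            continue  # not an "owner ttl class type rdata" line
        owner, ttl, _klass, rtype = fields[:4]
        if rtype == "RRSIG":
            rtype = fields[4]  # first RDATA field is the type covered
        ttls[(owner.lower(), rtype)].add(int(ttl))
    return dict(ttls)

def mismatched(dig_answer):
    """Return only the RRsets whose records or signatures disagree on TTL."""
    return {k: v for k, v in rrset_ttls(dig_answer).items() if len(v) > 1}

# Made-up sample: a DNSKEY with a 7 day TTL and an RRSIG that has drifted.
SAMPLE = """\
example.com. 604800 IN DNSKEY 257 3 13 (key data elided)
example.com. 431000 IN RRSIG DNSKEY 13 2 604800 20201017000000 20200917000000 12345 example.com. (signature elided)
"""

for (owner, rtype), seen in mismatched(SAMPLE).items():
    print(f"{owner} {rtype}: inconsistent TTLs {sorted(seen)}")
```

Anything this reports for a single response from the proxy would be
consistent with the bug described above.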

It sounds like you have a good idea of what the bug might be, and my guess
is probably the same. If we're right, you will be able to provoke
validation failures with the following steps:

  * query a (sacrificial!) record via the proxy with DO=0 (dig +nodnssec)
    to populate its cache with an RRset maybe lacking RRSIGs
    (that's the guess / bug)

  * change the sacrificial record on the primary

  * query again via the proxy with DO=1 (dig +dnssec) before the old TTL expires

If our guess is right, you'll get the old record with the new RRSIG and
validation will fail.
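
The steps above could be scripted roughly like this. It is only a sketch:
the proxy address (192.0.2.53), the record name, and the TXT type are
placeholders I made up, and the function names are hypothetical. It
assumes dig is on your PATH and only relies on the @server, +dnssec and
+nodnssec options.

```python
import subprocess

def dig_command(name, rtype, proxy, dnssec):
    """Build a dig command line aimed at the proxy, with DO=1 or DO=0."""
    flag = "+dnssec" if dnssec else "+nodnssec"
    return ["dig", f"@{proxy}", name, rtype, flag]

def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def reproduce(name, rtype, proxy):
    """Walk through the three steps; step 2 is done by hand on the primary."""
    # Step 1: DO=0 query, to populate the proxy cache (maybe without RRSIGs).
    print(run(dig_command(name, rtype, proxy, dnssec=False)))
    # Step 2: change the sacrificial record on the primary.
    input("Change the record on the primary, then press Enter... ")
    # Step 3: DO=1 query, which must happen before the old TTL expires.
    print(run(dig_command(name, rtype, proxy, dnssec=True)))

# Example call (placeholder values, not run automatically):
# reproduce("sacrificial.example.com", "TXT", "192.0.2.53")
```

If the last answer shows the old data next to a signature that was
generated after the change, that matches the guess.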

> I suspect that the signature hit the absolute time, got a fresh copy, and
> the DNSKEY stuck around another 2 days (1 week TTL). Now if the system
> wasn't security aware, I'm not sure how the TTL became unmatched but I can
> see that it could happen. I guess?


But there's another issue that can make this bug worse: I think the 7 day
TTL on your DNSKEY records is too long.

BIND's default sig-validity-interval is 30 days, and signatures are
regenerated 1/4 of the interval before expiry, i.e. 7.5 days.

If you want to avoid serving bogus signatures, you need to add together
the zone's SOA expire interval, the propagation delay between your primary
server and your public authoritative servers, and the maximum TTL of any
record in your zone. This sum must be less than the signature regeneration
interval (7.5 days by default).
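
To make the arithmetic concrete, here is a small Python sketch of that
check. The 30 day validity and the 1/4 fraction are the BIND defaults
mentioned above; the SOA expire, propagation delay, and TTL values in the
examples are assumptions picked for illustration, and the function names
are mine.

```python
from datetime import timedelta

def resign_lead_time(validity, fraction=0.25):
    """How long before expiry signatures are regenerated (1/4 by default)."""
    return validity * fraction

def staleness_budget(soa_expire, propagation, max_ttl):
    """Worst-case age of a signature by the time a validator stops using it."""
    return soa_expire + propagation + max_ttl

def is_safe(soa_expire, propagation, max_ttl, validity=timedelta(days=30)):
    """True if the worst-case age is less than the regeneration lead time."""
    return staleness_budget(soa_expire, propagation, max_ttl) < resign_lead_time(validity)

lead = resign_lead_time(timedelta(days=30))
print("regeneration lead time:", lead)  # 7 days, 12:00:00

# Assumed values: a week of SOA expire plus a 7 day DNSKEY TTL.
print(is_safe(timedelta(days=7), timedelta(minutes=1), timedelta(days=7)))  # False

# Assumed values: 5 days of SOA expire plus a 24 hour maximum TTL.
print(is_safe(timedelta(days=5), timedelta(minutes=1), timedelta(days=1)))  # True
```

With a 7 day TTL the budget is about 14 days, well over the 7.5 day
lead time, which is why the long DNSKEY TTL matters here.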

In practice you will never get anywhere near the expiry interval unless
things are broken, and NOTIFY means the propagation delay is negligible.
So in the real world the important number is how good you are at
monitoring zone propagation delays and fixing things if they become
non-negligible. To allow for SNAFUs this is about the same as the
traditional zone expiry time of about a week...

The logistics are a bit different if you have a reverse proxy in your
authoritative server setup, but I hope you get the idea of how to think
about making sure your DNSSEC signatures are fresh enough.

The other interesting number is the TTL. When choosing TTLs there are
roughly two kinds of records, which I call infrastructure records and,
uuuh, I don't have a word for the others - user records? application
records? Anyway, infrastructure records are the irrelevant crap a resolver
needs in order to get the answers that users actually care about, and of
course this irrelevant crap is the tricky stuff that DNS admins have to
work with: NS records, A and AAAA records of DNS servers, DNSKEY records,
DS records.

The TTL for infrastructure records should be relatively long, to minimize
the amount of irrelevant crap that resolvers have to deal with, i.e. to
reduce the tail latency experienced by end users while resolvers go off to
look at the infrastructure. You start hitting diminishing returns for
infrastructure TTLs after about 24 hours - delegation records in TLDs
typically have TTLs of 24h or 48h, and that's a reasonable length for your
in-zone infrastructure records too.

Any longer than that and you are creating pain for yourself any time you
have to do a nameserver migration or a DNSSEC rollover. With 24h TTLs
you'll need to allow a week for a significant move; for a 7 day TTL you
might be looking at a month of faff to deal with something that's often
tricky and perhaps unexpectedly urgent.

For other records, I find an hour is a reasonable balance between decent
cache performance and not-too-annoying update delays. I don't have records
with enough churn to justify shorter TTLs but your mileage may vary.

(There are scientific measurements of DNS TTL vs latency that agree
reasonably well with my suggestions, so there's a bit more to them than
convenient round numbers!)

> A low TTL would minimize it but appliance doesn't allow direct
> configuration for DNSKEY TTL.


f.anthony.n.finch  <dot at dotat.at>  http://dotat.at/
Biscay, Fitzroy, Sole: East or northeast 4 to 6, occasionally 7 later, but
cyclonic 3 to 5 in south Fitzroy and south Biscay. Moderate or rough, but
slight in southeast Biscay, becoming rough later in Sole. Thundery showers in
Biscay and Fitzroy. Good, occasionally poor in Biscay and Fitzroy.

More information about the bind-users mailing list