CNAME and other data -vs- could not find NS and/or SOA records

Fri Jun 4 03:32:57 UTC 2004

phil-news-nospam at ipal.net wrote:

>On Wed, 02 Jun 2004 17:13:19 -0400 Kevin Darcy <kcd at daimlerchrysler.com> wrote:
>
>| That's ridiculous. You want to special-case SOA and NS records? You want 
>| to selectively disable aliasing for SOA and NS? But what if I *want* to 
>| alias an SOA or NS record? What if I'm doing that today? You've just 
>
>If you have figured out how to do that today, then why not just say how
>you are doing it and we can end this thread as "problem solved".  If I'm
>to believe others here, you can't be doing that, anyway.
>
Perhaps I was not entirely clear. One uses CNAMEs to alias *names*. If 
the target of the alias happens to own an SOA record, then effectively 
you have aliased that SOA record (along with any other records it 
happens to own). For instance, www.foo.com could (if it is not a zone 
apex) be an alias for bar.com. bar.com could (and probably would, since 
it's an SLD) own an SOA record. So if anyone queried www.foo.com/SOA, 
they'd get the SOA of bar.com in the response. Voila! One has "aliased 
an SOA record". Somehow I think that's not what you're looking for. 
However, it's something people do, and the change you are proposing 
might break it. If so, you have a huge acceptance/deployment obstacle to 
overcome.
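
A toy sketch of that aliasing behavior, in Python. The zone contents, the placeholder SOA rdata and the resolve() helper are all made up for illustration and are not taken from any real nameserver; the only point is that the alias is followed for whatever QTYPE is asked, SOA included.

```python
# Toy data: owner name -> {record type: [rdata, ...]}.
# All names and rdata are illustrative placeholders.
ZONES = {
    "www.foo.com.": {"CNAME": ["bar.com."]},
    "bar.com.": {
        "SOA": ["ns1.bar.com. hostmaster.bar.com. 2004060401 3600 900 604800 3600"],
        "NS": ["ns1.bar.com.", "ns2.bar.com."],
    },
}


def resolve(name, qtype, data=ZONES, max_hops=8):
    """Return (chain, rrset): the aliases followed and the final RRset."""
    chain = []
    for _ in range(max_hops):
        node = data.get(name, {})
        if qtype in node:
            # The name owns the requested type: answer directly.
            return chain, node[qtype]
        if "CNAME" in node:
            # The name is an alias: restart the lookup at the target,
            # whatever the QTYPE happens to be.
            target = node["CNAME"][0]
            chain.append((name, target))
            name = target
            continue
        return chain, []
    raise RuntimeError("CNAME chain too long")


chain, answer = resolve("www.foo.com.", "SOA")
print(chain)   # the alias that was followed
print(answer)  # the SOA owned by bar.com.
```

Nothing in resolve() looks at the QTYPE before following the alias, which is why a www.foo.com/SOA query ends up with the SOA of bar.com.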

>| taken away functionality that people might be relying on. You'll never 
>| get that deployed unless you extend the protocol with some sort of 
>| "versioning" functionality so a client and server can agree to use the 
>| "Phil Howard" semantics instead of the normal ones. Good luck on that.
>| 
>| If I have the time, I might
>| generalize my patch so that it allows CNAME with any record, and answers
>| the CNAME if specific requested records are not present, or for ANY.
>| I think that will maximize the workability.
>| 
>| OK, but the stranded CNAME problem rears its ugly head again. You can't 
>| guarantee whether a given cache has the "specific requested records 
>| [...] present" or not, since any RRset can expire from the cache at any 
>| given time. I suppose you could limit your new semantics to only 
>| *authoritative* servers, but now you've created an inconsistency 
>| problem, where caching resolvers give different answers from 
>| authoritative servers for the same query. Bad juju...
>
>Since my case doesn't require the generalization, then we can drop that idea.
>That leaves the CNAME record as the only record in the zone, and we're aliasing
>even the SOA and NS records as you suggested above that you might want to use.
>So, if the authoritative server, for any query, always answers with a CNAME
>pointing to the target domain to query, what breaks?
>
Phil, here's a short quiz for you: the record "foo.com IN CNAME bar.com" 
exists, and, under your proposal, foo.com also owns SOA and NS records. 
bar.com also owns SOA and NS records. A query comes in for foo.com/NS. 
What NS set is returned?

A) The NS set for foo.com only
B) The NS set for bar.com only
C) A combination of both sets

If your answer is (A), then I assume it's because the nameservers 
involved are special-casing QTYPEs of NS and SOA, not following aliases 
for them as they normally would. This of course requires a modification 
to the resolution algorithm of every authoritative and caching 
nameserver in existence, the logistics of which, as Jim Reid pointed 
out, are quite daunting. But let's say you pull that off somehow, and 
every nameserver and caching server is now running with the modified 
Phil Howard algorithm. But you've still changed how DNS behaves even 
when viewed from DNS *clients*, and this could easily lead to more weird 
and wonderful forms of breakage. For a given DNS name, sometimes aliasing 
occurs, sometimes it does not; it all depends on the QTYPE, to which 
some clients may not be paying close attention. And what about 
QTYPE=AXFR queries? I guess they need to be special-cased too. And 
QTYPE=* queries? Do you answer (A), (B) or (C) for those? If you answer 
(A) or (B), then you upend the long-settled expectation of DNS clients 
that the results of a QTYPE=* query equal an amalgamation of what you 
would get by querying each record type separately (at least when the 
queries are made directly to an authoritative server; don't get me 
started again on the whole QTYPE=* caching debate). If you answer (C) 
for QTYPE=* queries (i.e. combine the results of both foo.com and 
bar.com), then not only do you cause all of the problems I mention below 
for the (C) answer (e.g. increasing workload and points of failure), but 
you're not even being consistent within your own approach, i.e. aliasing 
is disabled for SOA/NS/AXFR queries, enabled for all others, except for 
QTYPE=*, where a third methodology -- amalgamation -- is used. What a 
mish-mosh!
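
The QTYPE=* inconsistency can be checked with a small Python sketch. Everything in it is an assumption made for illustration: the record contents, the NOT_ALIASED set standing in for the special-cased QTYPEs, and the simplification that an aliased query returns only the target's records. It models answer (A) for specific QTYPEs and then compares the union of those per-type answers with a QTYPE=* answer taken from foo.com alone or from bar.com alone.

```python
# Toy data, all hypothetical: foo.com is an alias for bar.com but, under the
# proposal, also keeps its own NS and SOA records.
FOO = {
    "CNAME": ["bar.com."],
    "NS": ["ns1.foo.com."],
    "SOA": ["foo.com. SOA (placeholder)"],
}
BAR = {
    "NS": ["ns1.bar.com."],
    "SOA": ["bar.com. SOA (placeholder)"],
    "A": ["192.0.2.1"],
}
# Assumption: the QTYPEs that are answered at foo.com itself, without aliasing.
NOT_ALIASED = {"NS", "SOA", "CNAME"}


def query_foo(qtype):
    """Answer (A): special-case some QTYPEs, follow the alias for the rest."""
    source = FOO if qtype in NOT_ALIASED else BAR
    return {(qtype, rdata) for rdata in source.get(qtype, [])}


def all_records(node):
    """Every (type, rdata) pair owned by one name."""
    return {(rtype, rdata) for rtype, rdatas in node.items() for rdata in rdatas}


# What a client sees by asking for each record type separately.
per_type = set()
for qtype in ("CNAME", "NS", "SOA", "A"):
    per_type |= query_foo(qtype)

print(per_type == all_records(FOO))  # QTYPE=* answered from foo.com only
print(per_type == all_records(BAR))  # QTYPE=* answered from bar.com only
```

Both comparisons come out False: the per-type union contains bar.com's A record, which foo.com does not own, and foo.com's CNAME, NS and SOA, which bar.com does not own.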

Lastly, for an (A) answer to the original quiz, consider the nasty 
Dynamic Update interaction. A Dynamic Update client wants to add a 
foo.com SRV record, say, where none exists previously. It goes through 
the normal SOA/NS queries to identify the zone and zone master, and, 
according to what it is told, believes that foo.com is the zone in which 
such an update should be made. Goes to make the update, but it fails. 
Why? Because even though SOA and NS can, under your scaled-down 
proposal, co-exist with the foo.com CNAME, the new SRV record *cannot*. 
The client goes into error recovery, issuing an SRV query of foo.com to 
see if it exists (maybe it already checked that, but who knows? Someone 
else may have added it in the last few milliseconds). Lo and behold, 
foo.com is actually a CNAME to a name in a previously-unheard-of zone! 
You need to go over _there_, knucklehead, to make the update to a 
different zone and probably a different master server, unless of course 
the Dynamic Update client is so confused at this point, it's sitting in 
the corner and sobbing. Sure, you could say it's the client's fault -- 
it should have put a "foo.com NXRRSET CNAME" prereq in its first update 
so that it would fail quickly and cleanly and unambiguously. But that's 
not something it has to do today when updating a zone apex, so again, 
you're imposing a change in behavior not only on the *infrastructure* 
components of DNS, but even the *clients* of DNS, and who even knows how 
many different types of Dynamic Update and other DNS-interacting clients 
there are out there, which would all have to change?
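
The sequence can be walked through with a toy Python model. It is not RFC 2136 processing and not any real server's behavior: find_zone(), add_record(), the UpdateFailed exception, the SRV rdata and the rule that only SOA and NS may sit beside a CNAME are all hypothetical stand-ins for the scaled-down proposal as described above.

```python
# Assumption: under the scaled-down proposal only these types may share an
# owner name with a CNAME.
MAY_COEXIST_WITH_CNAME = {"SOA", "NS", "CNAME"}


class UpdateFailed(Exception):
    pass


def find_zone(names, name):
    """What the client does first: find the closest name that owns an SOA."""
    labels = name.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:]) + "."
        if "SOA" in names.get(candidate, {}):
            return candidate
    return None


def add_record(names, name, rtype, rdata, prereq_nxrrset=None):
    """Add one record, optionally requiring that some RRset not exist."""
    node = names.setdefault(name, {})
    if prereq_nxrrset is not None and prereq_nxrrset in node:
        raise UpdateFailed(f"prerequisite failed: {name} already owns {prereq_nxrrset}")
    if "CNAME" in node and rtype not in MAY_COEXIST_WITH_CNAME:
        raise UpdateFailed(f"{rtype} cannot coexist with the CNAME at {name}")
    node.setdefault(rtype, []).append(rdata)


# foo.com owns SOA and NS (so it looks like a zone apex) plus the CNAME.
names = {
    "foo.com.": {
        "SOA": ["(placeholder)"],
        "NS": ["ns1.foo.com."],
        "CNAME": ["bar.com."],
    }
}

print(find_zone(names, "foo.com."))  # the client concludes the zone is foo.com.

# First without a prerequisite, then with the "foo.com NXRRSET CNAME" one.
for prereq in (None, "CNAME"):
    try:
        add_record(names, "foo.com.", "SRV", "0 0 5060 sip.example.",
                   prereq_nxrrset=prereq)
    except UpdateFailed as err:
        print(err)
```

In this model both attempts fail; the difference is that the attempt carrying the prerequisite is told up front that a CNAME exists at the name, while the plain attempt only learns that its SRV was rejected.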

If your answer to the original quiz is (B), you've basically made the 
foo.com NS records "invisible". Ditto for the SOA record, since your 
proposal covers both record types. They're just dead weight: no-one sees 
them except perhaps in an answer to a QTYPE=AXFR or QTYPE=* query. Good 
luck trying to, say, Dynamically Update a zone with invisible SOA and NS 
records. Or for a slave to perform a serial-number check. At that point, 
one has to ask, why even have a foo.com zone? A zone with invisible SOA 
and NS records is not a zone in any meaningful sense of the term. Why 
not just put a foo.com CNAME record (and any records you want to put 
*under* foo.com) into the .com zone? From a protocol standpoint, that's 
exactly the way to handle it. It's just that the TLD operators don't 
currently allow us ordinary folks to put such records in the zones. It's 
basically an administrative/political issue, then, *not* a technical 
one. The protocol allows you to do what you want, but the TLD operators 
forbid it. Life is tough. Perhaps you should petition to start your own TLD.

If your answer to the original quiz is (C), you have roughly doubled the 
workload of servers everywhere (since they have to look at two zones and 
combine the results, instead of just looking at one), and introduced 
more possible points of failure for queries. Moreover, these 
amalgamated/conglomerated answers will just confuse resolvers and/or 
clients everywhere, which aren't expecting to get multiple RRsets in 
response to a non-"meta" query (by non-"meta", I mean a QTYPE that 
refers to a specific record type, as opposed to being QTYPE=*, 
QTYPE=AXFR or something like that). Even if the resolver doesn't 
consider such a response outright corrupt, you've raised a nasty 
ambiguity over which set of records is the "real" set. Again, the 
contents of SOA and NS records *matter* to things like zone-transfer and 
Dynamic Update; they're not just "placeholders" at the top of a zone, as 
you may be accustomed to thinking of them. Significant breakage ensues.
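
One way to picture the ambiguity is a secondary's serial-number check against a combined SOA answer. The sketch below is in Python and rests on several assumptions: the serial numbers are invented, needs_transfer() and its pick argument are hypothetical, and serials are compared as plain integers rather than with proper serial-number arithmetic.

```python
def needs_transfer(local_serial, soa_answer, pick):
    """Decide whether to transfer, given which SOA in the answer is believed.

    pick is a function choosing one record from the answer; with a single
    SOA in the answer the choice would not matter.
    """
    primary_serial = pick(soa_answer)["serial"]
    return primary_serial > local_serial


# Answer (C): the SOA RRsets of foo.com and bar.com combined in one response.
merged = [
    {"owner": "foo.com.", "serial": 2004060401},
    {"owner": "bar.com.", "serial": 2004053001},
]
local = 2004060101  # the serial the secondary currently holds for foo.com

print(needs_transfer(local, merged, lambda rrs: rrs[0]))   # believes foo.com's SOA
print(needs_transfer(local, merged, lambda rrs: rrs[-1]))  # believes bar.com's SOA
```

With these made-up serials the first check says a transfer is needed and the second says it is not, so the outcome depends entirely on which of the two SOA records the secondary treats as the "real" one.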

So, bottom line, depending on your answer to the quiz, you cause 
breakage and/or the *real* solution to your problem is an administrative 
rather than a technical one.

                                                                  - Kevin

P.S. You've asked how and why "CNAME and other data" "worked" back in 
the Bad Old Days when BIND allowed it. I think it worked 
*inconsistently*: depending on the order of records in the zone file, 
named would either consider the name to be a CNAME exclusively or the 
CNAME would be "hidden" behind the other records. So, as you maintain 
your zonefile, you could get inconsistent results from the same master 
server if you ever rearranged the order of records. Even if you didn't, 
since zone transfers (if they worked at all!) aren't required to 
preserve record order, your slaves could respond inconsistently with 
your master or between themselves. Caching nameservers then would be at 
the mercy of whatever authoritative server they talked to last and for 
which they still had cache entries, so they answered inconsistently too. 
Is this really a state of affairs to which you wish to return?
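
A hypothetical illustration of that kind of order dependence, in Python. This is not how old versions of BIND were implemented; load() and lookup() are invented for the sketch, and the rule "the first record seen at a name decides whether the name is an alias" is an assumption chosen only to show how the same set of records can answer differently depending on file order.

```python
def load(records):
    """Hypothetical order-sensitive loader.

    The first record seen at a name decides whether the name is treated
    as an alias; later records are stored but do not change that decision.
    """
    nodes = {}
    for name, rtype, rdata in records:
        node = nodes.setdefault(name, {"alias": rtype == "CNAME", "rrsets": {}})
        node["rrsets"].setdefault(rtype, []).append(rdata)
    return nodes


def lookup(nodes, name, qtype):
    node = nodes[name]
    if node["alias"]:
        # Treated as a CNAME exclusively: other data is never returned.
        return ("CNAME", node["rrsets"]["CNAME"])
    # Otherwise the CNAME is hidden behind the other records.
    return (qtype, node["rrsets"].get(qtype, []))


records = [
    ("foo.com.", "CNAME", "bar.com."),
    ("foo.com.", "MX", "10 mail.foo.com."),
]

print(lookup(load(records), "foo.com.", "MX"))
print(lookup(load(list(reversed(records))), "foo.com.", "MX"))
```

The two calls load exactly the same records; only their order differs. The first returns the CNAME and the second returns the MX record, which is the sort of master/slave disagreement described above if a transfer happens to reorder the records.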