[bind10-dev] Behaviour with Duplicate RRs in Zone File

Fri Oct 7 19:50:21 UTC 2011

At Fri, 07 Oct 2011 17:08:05 +0100,
Stephen Morris <stephen at isc.org> wrote:

> What should be the behaviour of BIND 10 when a zone file is loaded
> that contains a duplicate record?
> 
> If the following zone file is loaded into BIND 10:
> 
> $ORIGIN example.
> $TTL 3600
> @	IN	SOA	ns2.example. hostmaster.example. 100 3600 900 7200 300
> 	IN	NS	ns2.example.
> 	IN	NS	ns3.example.
> 
> ns2 	IN 	A   192.0.2.254
> ns3	IN 	A   192.0.2.253
> 
> identical	IN	A	192.0.2.1
> identical	IN	A	192.0.2.1
> 
> ... querying for identical.example. returns a response with the answer
> section containing two A records.  In contrast, BIND 9.8.1 ignores
> identical records; when loaded with the above zone file, it returns an
> answer section containing one A record.

BIND 9 compacts duplicate RDATA when it loads a zone into memory (it
maintains RRs strictly in the form of RRsets).  That's why you
*normally* see only one "identical" RR.  But this compaction is a
feature of the in-memory database implementation, not of the query
processing.  I haven't fully checked, but I suspect that if you use
BIND 9 dlz and feed it duplicate data, BIND 9 could return the
redundant record, too (from a quick look,
lib/dns/sdlz.c:dns_sdlz_putrr() simply appends new RRs without any
duplicate check, and backend drivers don't seem to do compaction
either).  I also suspect PowerDNS doesn't suppress duplicate RRs.
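
To make the load-time compaction concrete, here is a minimal Python
sketch.  It is not BIND 9's code: the function name compact_rrs() and
the tuple layout are made up for illustration, and RDATA is compared
as plain text rather than in canonical wire form.

```python
from collections import OrderedDict


def compact_rrs(rrs):
    """Group (name, rrclass, rrtype, rdata) tuples into RRsets.

    Duplicate RDATA within the same (name, class, type) is dropped,
    so each RRset ends up holding only distinct RDATA.
    """
    rrsets = OrderedDict()
    for name, rrclass, rrtype, rdata in rrs:
        # Owner names are case-insensitive in the DNS.
        key = (name.lower(), rrclass, rrtype)
        rdatas = rrsets.setdefault(key, [])
        if rdata not in rdatas:
            rdatas.append(rdata)
    return rrsets


# The two "identical" records from the zone file in question.
zone_rrs = [
    ("identical.example.", "IN", "A", "192.0.2.1"),
    ("identical.example.", "IN", "A", "192.0.2.1"),
]
rrsets = compact_rrs(zone_rrs)
```

With this kind of storage, a query for identical.example. can only
ever see one A record, regardless of how the query logic is written.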

> I think the BIND 10 should follow BIND 9.  If nothing else, we would
> have problems were the BIND 9 server to be a slave to BIND 10; it
> would be possible for the master and slave to issue different answers
> to the same query.

Well, for this particular case, not necessarily; it depends on the
definition of "different".  Setting aside, for simplicity, the case
where the TTLs differ, the only difference is the total number of RRs.
As a "set", they are identical.  RFC 2181 doesn't seem to say anything
about what the receiver should do in this case, but the last paragraph
of Section 5.2 seems to suggest that clients should generally be
"liberal" when they receive an (authoritative) response that does not
strictly meet the definition of RRsets.
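
As a small illustration of the "set" view (a sketch only; the tuple
layout and the helper name same_rrset() are assumptions of mine, not
anything taken from BIND), the two answers compare equal once
duplicates are ignored, even though their RR counts differ:

```python
def same_rrset(answer_a, answer_b):
    """Compare two answer sections as sets, ignoring duplicate RRs."""
    return set(answer_a) == set(answer_b)


rr = ("identical.example.", 3600, "IN", "A", "192.0.2.1")
bind10_answer = [rr, rr]   # duplicate RR sent as-is
bind9_answer = [rr]        # duplicate compacted at load time
```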

Still, I personally think BIND 10 should follow BIND 9 (or, to be
fair, do even better than BIND 9, because even BIND 9 wouldn't be
correct with dlz:-).  That's partly because we want to provide a
"reference implementation" that generally conforms to protocol
standards.  It's also possible, as you guessed, that sending
duplicate RRs could cause a problem.  For example, if, by some
accident, a zone contains a very large number of duplicate RRs, we'd
end up falling back to TCP unnecessarily or, even worse, we may not
be able to return a full answer even in a TCP response.
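
For a rough feel of the numbers, the following Python sketch
estimates how many duplicate A RRs fit in a response.  The
assumptions are mine: no EDNS0 (so the classic 512-byte UDP limit
applies), only the header, question and answer sections are counted,
and every owner name in the answer section is compressed to a 2-byte
pointer.  The function names are made up for illustration.

```python
HEADER_LEN = 12          # fixed DNS header size
UDP_LIMIT = 512          # classic UDP payload limit without EDNS0
TCP_LIMIT = 65535        # max message size with a 2-byte length prefix


def wire_name_len(name):
    """Length of an uncompressed domain name in wire format."""
    labels = [label for label in name.split(".") if label]
    # One length octet per label, plus the terminating root label.
    return sum(1 + len(label) for label in labels) + 1


def response_len(qname, n_answers, rdlen=4):
    """Header + question + n_answers RRs with compressed owner names."""
    question = wire_name_len(qname) + 4        # QTYPE + QCLASS
    # pointer(2) + TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2) + RDATA
    answer_rr = 2 + 2 + 2 + 4 + 2 + rdlen
    return HEADER_LEN + question + n_answers * answer_rr


def max_answers(qname, limit, rdlen=4):
    """Largest number of answer RRs that still fits within limit."""
    fixed = response_len(qname, 0, rdlen)
    per_rr = response_len(qname, 1, rdlen) - fixed
    return (limit - fixed) // per_rr


udp_max = max_answers("identical.example.", UDP_LIMIT)
tcp_max = max_answers("identical.example.", TCP_LIMIT)
```

Under these assumptions udp_max is 29 and tcp_max is 4093, so a
30th duplicate would already push a plain UDP response over the
limit.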

Actually, I was aware that our current implementation has this
problem.  In the new data source API it should be relatively easy to
fix:  In the addOrCreate() function of lib/datasrc/database.cc we
should perform a duplicate check and compaction.  This would make the
already-slow implementation even slower, but unless we completely
enforce the "captive" mode, this overhead is essentially inevitable.
We could probably mitigate the performance issue with the hot spot
cache (we'd cache the compacted result), and/or we might introduce a
per-data-source option to disable this compaction (an operator who is
sure that the data source is only maintained in the captive mode
would enable it).  For the in-memory data source, we should of course
do this compaction at build time, like BIND 9, to optimize query
performance.
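
To show the shape of the idea, here is a minimal Python stand-in.
It does not reflect the actual C++ signature of addOrCreate(); the
RRset class, the add_or_create() function and the "compact" flag
(standing in for the per-data-source option) are all hypothetical.

```python
class RRset:
    """Very small stand-in for an RRset: a name, a type, and RDATA."""

    def __init__(self, name, rrtype):
        self.name = name
        self.rrtype = rrtype
        self.rdatas = []


def add_or_create(rrset, name, rrtype, rdata, compact=True):
    """Add rdata to rrset, creating the RRset on first use.

    With compact=True a duplicate RDATA is silently skipped.  The
    linear membership test is the extra per-record cost in question.
    """
    if rrset is None:
        rrset = RRset(name, rrtype)
    if compact and rdata in rrset.rdatas:
        return rrset
    rrset.rdatas.append(rdata)
    return rrset


def build_rrset(name, rrtype, rows, compact=True):
    """Build one RRset from the RDATA rows a backend returned."""
    rrset = None
    for rdata in rows:
        rrset = add_or_create(rrset, name, rrtype, rdata, compact)
    return rrset


rows = ["192.0.2.1", "192.0.2.1"]   # what the database returned
checked = build_rrset("identical.example.", "A", rows)
unchecked = build_rrset("identical.example.", "A", rows, compact=False)
```

Here "checked" ends up with a single RDATA, while "unchecked" keeps
both, which is what an operator would get after disabling the check
for a data source that is trusted to be clean.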

The compaction implementation described above should be quite
straightforward (except for the possible performance issue).  Should
we create a ticket and put it in the next sprint?

---
JINMEI, Tatuya