[bind10-dev] Suboptimal way of splitting XFR-out to messages

Fri Dec 2 19:08:37 UTC 2011

At Thu, 1 Dec 2011 10:41:03 +0100,
Michal 'vorner' Vaner <michal.vaner at nic.cz> wrote:

> During review of #1389, it was discovered that xfrout splits the stream into
> messages in a suboptimal way. Currently, it keeps a sum of RR sizes and if this
> overflows, it renders the message and starts a new one.

To be precise, it's not a newly discovered issue.  At least some of us
were aware of it when it was first written.

> But the rendering compresses names. This means a few more RRs could fit in, if we
> are lucky and there are lots of similar names (which there usually are, at least
> the names of the RRs, if not inside the real data).
> 
> This is not really a big deal, it doesn't break anything. But it generates more
> messages and can increase the size of the whole stream, as we can't compress
> names across message boundaries. Having fewer messages means having fewer
> boundaries and more compression.

Personally I'm not convinced it's worth solving.  Even though names
cannot be compressed between different messages, in practice they
should be well compressed within their own messages.  So, roughly
speaking the additional overhead is the size of the DNS header part
(12 bytes each) that would not be needed if we made every message
really full.  If, hypothetically, a "suboptimal" message still
contained 65000 (instead of 65535) bytes of data, the overhead of the
header part is just about 0.018% (12/65000).  While the total number
of messages also matters in terms of processing overhead (in general
handling two messages would be more heavyweight than handling one,
etc), I suspect it's still marginal in the total workload for the
entire XFR session.
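
To make the arithmetic above concrete, here is a rough sketch of the
header-only overhead.  The function name and the 100MB figure are made
up for illustration, and it assumes the sizes are those of the
rendered (compressed) RRs; it ignores the question section and the
2-byte TCP length prefix.

```python
HEADER_LEN = 12      # fixed DNS header size in bytes
MAX_MSG_LEN = 65535  # largest DNS message that fits the TCP length field


def header_overhead(total_rr_bytes, msg_len):
    """Return (message count, total header bytes) when every message is
    filled up to msg_len bytes, header included.

    Hypothetical helper: total_rr_bytes is the rendered size of all RRs.
    """
    payload = msg_len - HEADER_LEN
    count = -(-total_rr_bytes // payload)  # ceiling division
    return count, count * HEADER_LEN


# Hypothetical zone: 100MB of rendered RR data.
full = header_overhead(100_000_000, MAX_MSG_LEN)
sub = header_overhead(100_000_000, 65000)
extra_messages = sub[0] - full[0]
extra_header_bytes = sub[1] - full[1]
```

Under these assumptions the extra header bytes are negligible next to
the size of the stream, which is the point made above; the per-message
processing cost is not modeled here.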

So, my gut feeling is that unless the overhead of making it "optimal"
is itself lightweight, it's not worthwhile.  One approach in this
sense would be to extend the Message class interface so that we can
allow "incremental rendering" as we add RRs.  That wouldn't be
difficult to implement, but I personally don't believe the additional
complexity of the class interface is justified if the only conceivable
(and realistic) use case is this one.  Another approach is to work a
bit harder within xfrout, e.g., if the first name of a message is a
subdomain of the zone's origin (which should normally be the case),
take into account that at least the origin part of all subsequent
names will be compressed.  This won't provide a fully optimal result,
but doesn't require expensive efforts like two-pass rendering or an
extension to the Message interface.
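
A minimal sketch of that second approach follows.  It is not the
actual xfrout code: all names here are hypothetical, names are plain
strings without escaped dots, each RR is reduced to (owner name,
RDATA length), and it assumes the renderer compresses an owner name
against an earlier occurrence of the origin in the same message.
Names inside RDATA are counted uncompressed, so the estimate stays on
the conservative side.

```python
HEADER_LEN = 12      # fixed DNS header
MAX_MSG_LEN = 65535  # maximum message size over TCP
RR_FIXED_LEN = 10    # TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
POINTER_LEN = 2      # size of a compression pointer


def labels(name):
    """Split an absolute name such as 'www.example.org.' into labels."""
    return [l for l in name.lower().rstrip('.').split('.') if l]


def wire_len(name):
    """Uncompressed wire length: length octet per label plus the root."""
    return sum(len(l) + 1 for l in labels(name)) + 1


def is_subdomain(name, origin):
    n, o = labels(name), labels(origin)
    return len(n) >= len(o) and n[len(n) - len(o):] == o


def estimated_rr_len(owner, rdata_len, origin, origin_in_msg):
    """Estimate the rendered RR size, assuming the origin suffix of the
    owner name becomes a pointer once the origin is in the message."""
    length = wire_len(owner)
    if origin_in_msg and is_subdomain(owner, origin):
        length = min(length, length - wire_len(origin) + POINTER_LEN)
    return length + RR_FIXED_LEN + rdata_len


def split_into_messages(rrs, origin):
    """Group (owner, rdata_len) pairs into messages using the estimate."""
    messages, current = [], []
    used, origin_in_msg = HEADER_LEN, False
    for owner, rdata_len in rrs:
        size = estimated_rr_len(owner, rdata_len, origin, origin_in_msg)
        if current and used + size > MAX_MSG_LEN:
            messages.append(current)
            current, used, origin_in_msg = [], HEADER_LEN, False
            # First RR of a new message cannot point at anything yet.
            size = estimated_rr_len(owner, rdata_len, origin, False)
        current.append((owner, rdata_len))
        used += size
        origin_in_msg = origin_in_msg or is_subdomain(owner, origin)
    if current:
        messages.append(current)
    return messages
```

For example, with owner names like 'h0001.example.org.' and 4-byte
RDATA, the estimate drops from 33 to 22 bytes per RR after the first
one in each message, so noticeably more RRs are packed per message
than with a plain sum of uncompressed sizes.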

In any case, if we wanted to pursue this issue further, I'd first
suggest measuring the expected gain using some real-world examples.
Since the solution cannot be a few lines of code anyway, we should do
it only once we really know that it quantitatively and substantially
improves the result.

BTW, as an FYI, BIND 9 basically behaves as BIND 10 currently does (I
guess the current implementation was derived from the BIND 9
implementation).  From a quick look, NSD does "incremental rendering"
and produces "optimal" messages.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.