[bind10-dev] DNS packet API: the name object

Wed Aug 12 13:35:52 UTC 2009

On Tue, 2009-08-11 at 16:13 -0700, JINMEI Tatuya / 神明達哉 wrote:
> At Mon, 10 Aug 2009 19:31:56 +0200,
> Shane Kerr <shane at isc.org> wrote:
> 
> > The is_absolute() method seems like it will complicate things.
> > 
> > Do we really need to handle non-absolute names? Perhaps it is better to
> > leave the absolute/relative name handling to a different class?

> Maybe we can begin with absolute names only and see if we really need
> to support non-absolute ones as part of the API.

Sounds like a plan.

> > If we do want to handle non-absolute names, do we need a way to create a
> > context which says what the current domain is (or the current search
> > list is)? In that case it would be something like:
> > 
> > 	explicit Name(const string &namestr, const SearchList *sl = NULL);
> 
> or passing an "origin" to make a non-absolute (relative) name
> absolute.  It seems to be the convention of dnspython.

I guess it depends on whether we want something like $ORIGIN in a zone
file, or whether we want something like "search" in /etc/resolv.conf. In
any case, it makes me happy we are going to do it in a separate
class. :)

> > Regarding the comment in the code about a "from-wire" constructor...
> > there are two different levels I can see this at:
> > 
> >      1. Something that takes a simple length/data/length/data/'\0' DNS
> >         encoded name and uses that for constructing.
> >      2. Something that takes a data blob (presumably a DNS packet) with
> >         DNS compression and everything, plus a pointer into that blob,
> >         then builds a DNS name out of that.
> > 
> > The first is straightforward, but probably less useful (well, useful for
> > an authoritative-only server perhaps!). The second is... trickier,
> > right? I see the value in it, but maybe it makes more sense to provide a
> > helper function that takes a vector<> of strings, each of which is a
> > label. We could keep the actual concerns about boundary checking and
> > such in their proper place (a packet-parsing class), but still have a
> > relatively efficient way to build a Name object.
> 
> FWIW, both dnspython and BIND9 libdns take the second approach: the
> "from wire" function takes a DNS message with a kind of "offset" and
> constructs a name, possibly decompressing labels.
> 
> But this interface may be too "low level", and we may have to consider
> a higher abstraction (if that's what you mean above).  This will be a
> TODO item of this topic.

Okay, that's fine.

> > Speaking of benchmarking... do we have performance targets for this
> > class? If we intend to use it in packet processing, we can do some
> > guesses as to what is acceptable based on that.
> > 
> > So, for example, if we assume an authoritative-only server needs to
> > answer 100K queries a second on our as-yet-undefined target, and it
> > creates an average of 10 names, with 50 comparisons, then we would want
> > to be able to do something like 10M creations and 50M comparisons a
> > second on that machine if we can afford to spend 20% of our time doing
> > name processing. I have no idea if these numbers make sense, of
> > course. :)
> 
> Maybe we can begin with microbenchmark for basic operations such as
> from-wire, comparison, concatenation, split, etc, against BIND9 libdns
> and the BIND10 implementation.  Whether we need system-wide benchmark
> for this module (class) would depend on whether these operations are
> major bottleneck in the actual message processing.

Hm... shall we set the BIND 9 speed as a minimum or not even bother with
that?

I ask because this is kind of opposite of the idea that Michael is
pushing, which is that we set benchmarks in advance.

> > > - the fullcompare() method is similar to libdns's
> > >   dns_name_fullcompare(), but returns an abstract "result" object,
> > >   encapsulating the comparison result, ordering as an integer, and the
> > >   number of common labels.  dns_name_fullcompare() took pointers to
> > >   integers as arguments for the latter two, but I believe returning a
> > >   single result is more intuitive for the API user.
> > 
> > Makes sense. Do we need functions that do less, for example an explicit
> > is_subdomain_of() function? Also, I'm not really clear what the various
> 
> Perhaps.  dnspython has is_subdomain(), and it's actually implemented
> using fullcompare().  In that sense this is a possibly-useful wrapper,
> rather than a mandatory primitive, so I omitted it in the initial
> proposal.

The thing is that fullcompare() provides information which takes
non-zero time to discover. If you don't need this, then a useful wrapper
will be slower than an actual implemntation of is_subdomain(). I guess
we can leave it out, and then worry about how to implement it if we use
it (but this seems like something that will be immediately useful).

> > name relations are (I don't know what "contains" is compared to
> > "subdomain" for example).
> 
> These are derived from the convention of BIND9's libdns:
> 
>  *\li	dns_namereln_none		There's no hierarchical relationship
>  *					between name1 and name2.
>  *\li	dns_namereln_contains		name1 properly contains name2; i.e.
>  *					name2 is a proper subdomain of name1.
>  *\li	dns_namereln_subdomain		name1 is a proper subdomain of name2.
>  *\li	dns_namereln_equal		name1 and name2 are equal.
>  *\li	dns_namereln_commonancestor	name1 and name2 share a common
>  *					ancestor.
> 
> dnspython uses almost the same notion of comparison result (it uses
> "superdomain" and "subdomain", instead of "contains" and "subdomain",
> which would be more intuitive).  We don't necessarily have to follow
> the same convention, though.

Ah, okay. These all make sense... perhaps 

> > BTW, we need a NameTooLong exception, right?
> 
> Perhaps.  In the current prototype implementation it triggers a more
> generic "NoSpace" exception.

To me "NoSpace" seems like "Out of memory" to me. We need to make the
exceptions a bit clearer.

In fact, looking through the code makes me worry. I can feel the BIND 9
coming though, because I have no idea what is going on. ;)

Variable declaration section:

        char c;
        ft_state state;
        unsigned int value, count, pos, lpos;
        unsigned int n1, n2, tlen, nrem, nused, digits, labels, tused;
        bool done;

While I know the tradition is to never add a comment that could be
avoided, I think we should perhaps break with tradition and explain what
these are and why they are going to be used. Likewise the state machine
has no description of how or why it is doing what it is doing. For
example, there is some special handling for '@' characters, which I
haven't figured out yet, but is non-obvious from the code.

Should I send a detailed review off-list, or should I wait for this
class to mature a bit?

> > We have Jeremy willing and able to help with documentation. Does it make
> > sense for him to start with documenting this class while you continue
> > working on the implementation? It would be great to have the two of you
> > co-operating on this.
> 
> I'll talk to Jeremy when he comes back from vacation.  I don't think
> we have to write a precise document of this prototype
> interface/implementation at this stage (it will surely change).  But
> we can use it as a practice of how to make a useful document while
> writing code.

Sure, makes sense.

--
Shane