[bind10-dev] DNS packet API: the name object

Mon Aug 10 17:31:56 UTC 2009

Jinmei,

Nice to see things getting started. :)

On Thu, 2009-08-06 at 14:58 -0700, JINMEI Tatuya / 神明達哉 wrote
> 
> Here is a summary of the new "Name" class:
> - constructed from a string (e.g. "www.example.com")
> - supported methods:
>   + conversion: to_string()
>   + transformation: split(), concatenate()
>   + properties: is_wildcard(), is_absolute(), countlabels()
>   + comparison: fullcompare(), ==, !=, <=, >=, <, >

The is_absolute() method seems like it will complicate things.

Do we really need to handle non-absolute names? Perhaps it is better to
leave the absolute/relative name handling to a different class?

If we do want to handle non-absolute names, do we need a way to create a
context which says what the current domain is (or the current search
list is)? In that case it would be something like:

	explicit Name(const string &namestr, const SearchList *sl = NULL);

BTW, I'm using the terminology I get from the resolv.conf man page,
which specifies multiple domains on a search list declaration.

Regarding the comment in the code about a "from-wire" constructor...
there are two different levels I can see this at:

     1. Something that takes a simple length/data/length/data/'\0' DNS
        encoded name and uses that for constructing.
     2. Something that takes a data blob (presumably a DNS packet) with
        DNS compression and everything, plus a pointer into that blob,
        then builds a DNS name out of that.

The first is straightforward, but probably less useful (well, useful for
an authoritative-only server perhaps!). The second is... trickier,
right? I see the value in it, but maybe it makes more sense to provide a
helper function that takes a vector<> of strings, each of which is a
label. We could keep the actual concerns about boundary checking and
such in their proper place (a packet-parsing class), but still have a
relatively efficient way to build a Name object.

> Other design choices:
> - for conversion and transformation, it generally returns a new
>   object.  For example, to_string() returns a new (std::)string object
>   representing the name as string.  This will be convenient for the
>   API user (unlike BIND9's libdns) in that the caller doesn't have to
>   prepare the new object beforehand.  Also, since all dynamically
>   allocated data is encapsulated, we don't have to worry about
>   resource leak (too much).

Makes sense. If it becomes a bottleneck we can revisit this.

> - one possible concern with this approach is that it may tend to
>   require data copy more frequently than other approaches (such as
>   pointer/reference-based design) and may cause non-negligible
>   performance penalty.  This may be a non-issue in practice, however,
>   if the underlying containers are carefully designed/implemented
>   (e.g., multiple instances of std::string can share the actual data,
>   making copies cheaper).  In any case, we'll eventually need to do
>   benchmark.

Yup. The STL is defined in such a way to allow reference counting
implementations to be easily supported, although it is not a
requirement. Benchmarking make sense.

Speaking of benchmarking... do we have performance targets for this
class? If we intend to use it in packet processing, we can do some
guesses as to what is acceptable based on that.

So, for example, if we assume an authoritative-only server needs to
answer 100K queries a second on our as-yet-undefined target, and it
creates an average of 10 names, with 50 comparisons, then we would want
to be able to do something like 10M creations and 50M comparisons a
second on that machine if we can afford to spend 20% of our time doing
name processing. I have no idea if these numbers make sense, of
course. :)

> - the fullcompare() method is similar to libdns's
>   dns_name_fullcompare(), but returns an abstract "result" object,
>   encapsulating the comparison result, ordering as an integer, and the
>   number of common labels.  dns_name_fullcompare() took pointers to
>   integers as arguments for the latter two, but I believe returning a
>   single result is more intuitive for the API user.

Makes sense. Do we need functions that do less, for example an explicit
is_subdomain_of() function? Also, I'm not really clear what the various
name relations are (I don't know what "contains" is compared to
"subdomain" for example).

> - in general, error cases trigger C++ exceptions, rather than an error
>   code as a return value.  I know this is a controversial design
>   decision and I'm not sticking to this approach.  I'm open to other
>   suggestions on this.

I think it makes sense. I think the guideline should basically be things
that you don't expect to see very often should be exceptions. The
advantage of exceptions is that you *notice* if you forgot to handle
them, unlike return codes which often get silently (unintentionally)
ignored.e

BTW, we need a NameTooLong exception, right?

We have Jeremy willing and able to help with documentation. Does it make
sense for him to start with documenting this class while you continue
working on the implementation? It would be great to have the two of you
co-operating on this.

--
Shane