[bind10-dev] DNS message API: the message class

Thu Oct 1 08:22:42 UTC 2009

As I mentioned in a shorter report last night (PDT), I've committed the
first prototype implementation of the DNS message class, with sample
code to show its feasibility.  This is a more comprehensive
description of the class and its related API design.
Comments/questions/suggestions are highly appreciated as usual.

General Design Goals:
  easy to use: this API is expected to be used by external developers,
    so the interface should be as simple and intuitive as possible.
  standard conformance: it should support all major protocol features,
    such as TSIG/SIG(0), EDNS, dynamic update.
  performance-aware: still, the underlying implementation should be
    able to be very efficient.  For example, we should allow a
    pre-compiled, wire-formatted data source to construct a DNS
    message based on its data with no or very few copies.
  extensibility: it should be able to (at least) support BIND9's
    feature set, such as rrset-order, preferred-glue, sortlist (even
    if these are not available in the 1st-year version).  It should
    also be possible (and easy) for an external developer to add their
    own extensions.

(ambitious goals, aren't they? :-)

Existing APIs:

I've looked at various existing libraries/APIs that provide similar
functionality:
  - BIND9 libdns (the message module): good in the sense of
    conformance, but not easy to use and can be slower due to its
    redundant (generalized) design
  - ldns: looks good, but IMO we could provide more convenient APIs.
    It also (seemingly) misses some protocol features as I noted in
    the truncation discussion.
  - script-based tools (dnspython and dnsruby): intuitive, but the
    implementation is straightforward (which is not a bad thing in
    itself) and would run slower.  They'd also be slower for the
    obvious reason: they are written in scripting languages.

The Class/API Description:
  Message class: this is a container of an abstract notion of a "DNS
  message".  A wire-format message is converted to/from this message
  object, and the application manipulates a DNS message through an
  instance of this object.  All existing APIs (see above) have a
  similar class/structure, so it should be easy to understand what
  the message class is at a high level.

  See the (tentative) class definition in
  experiments/jinmei-messageapi/dnsmessage.hh of the svn repository.
  I believe the class methods (member functions) are generally
  straightforward.  These include:
    - accessor to the DNS header fields: {get,set}_qid(),
      {get,set}_{qr,tc,rd,ra,ad,cd}(), {get,set}_opcode(),
      {get,set}_rcode()
    - manipulate RRsets for a specified section
      (question/answer/auth/additional): add_rrset(), get_section(),
      etc.
    - rendering and parsing to/from wire format: to_wire(), parse()
    - other advanced features such as TSIG/EDNS (TBD)
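
  To make the accessor/rendering split concrete, here is a minimal
  sketch.  It is not the code in dnsmessage.hh: only the method names
  come from the list above, while the signatures, the storage layout
  and the choice to return a byte vector from to_wire() are
  assumptions, and only the qr and rd flags are shown.  Sections are
  left out, so the four count fields are rendered as zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real class: only the method names are
// taken from the list above; signatures and storage are assumptions.
class Message {
public:
    uint16_t get_qid() const { return qid_; }
    void set_qid(uint16_t qid) { qid_ = qid; }

    bool get_qr() const { return (flags_ & QR_BIT) != 0; }
    void set_qr(bool on) { set_flag(QR_BIT, on); }
    bool get_rd() const { return (flags_ & RD_BIT) != 0; }
    void set_rd(bool on) { set_flag(RD_BIT, on); }

    unsigned get_opcode() const { return (flags_ >> 11) & 0x0f; }
    void set_opcode(unsigned opcode) {
        assert(opcode < 16);  // opcode is a 4-bit field
        flags_ = static_cast<uint16_t>((flags_ & ~(0x0fu << 11)) |
                                       (opcode << 11));
    }
    unsigned get_rcode() const { return flags_ & 0x0f; }
    void set_rcode(unsigned rcode) {
        assert(rcode < 16);   // without EDNS, rcode is a 4-bit field
        flags_ = static_cast<uint16_t>((flags_ & ~0x0fu) | rcode);
    }

    // Renders only the 12-byte header; the four section counts stay at
    // zero because this sketch holds no RRsets.
    std::vector<uint8_t> to_wire() const {
        std::vector<uint8_t> out(12, 0);
        out[0] = static_cast<uint8_t>(qid_ >> 8);
        out[1] = static_cast<uint8_t>(qid_ & 0xff);
        out[2] = static_cast<uint8_t>(flags_ >> 8);
        out[3] = static_cast<uint8_t>(flags_ & 0xff);
        return out;
    }

private:
    static const uint16_t QR_BIT = 0x8000;
    static const uint16_t RD_BIT = 0x0100;
    void set_flag(uint16_t bit, bool on) {
        flags_ = static_cast<uint16_t>(on ? (flags_ | bit) : (flags_ & ~bit));
    }
    uint16_t qid_ = 0;
    uint16_t flags_ = 0;
};
```

  An application would set the header fields through the accessors
  and then call to_wire() once, without touching bit positions itself.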

  For extensibility, I've introduced some helper classes that are
  expected to work with the message class:

  - MessageRenderer
    Encapsulates states and algorithm for rendering a message in wire
    format.  This is an abstract class, and the user will use a
    specific concrete subclass depending on their usage.  In the
    initial prototype I wrote a simple subclass
    "MessageRendererToBuffer", which just encapsulates a packet buffer
    (and something like this will be a "default renderer" when the
    user doesn't have a particular preference).  We may want to add
    more optional/conditional strategies to this or other subclasses,
    e.g., what to do when the message is too large.

    Another future possibility I have in mind is an "iovector-based
    renderer" for higher performance.  It will be used with a
    pre-compiled, wire-formatted data source, and build a packet as a
    vector of pointers to the pre-formatted data.  We'll then use
    sendmsg() with this vector, thereby minimizing data copies.
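
    The iovector idea could look roughly like the following sketch.
    The class name IovecRenderer, its methods and the flatten()
    helper are hypothetical; only struct iovec (from <sys/uio.h>) is
    a real POSIX type.  The renderer records pointers into the
    pre-formatted data instead of copying the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#include <sys/uio.h>   // struct iovec (POSIX)

// Hypothetical renderer that collects references to wire-format data.
class IovecRenderer {
public:
    // Records a reference to pre-formatted data; no bytes are copied.
    // The caller must keep the data alive until the packet is sent.
    void write_ref(const void* data, size_t len) {
        assert(data != nullptr || len == 0);
        iovec v;
        v.iov_base = const_cast<void*>(data);
        v.iov_len = len;
        vec_.push_back(v);
        total_ += len;
    }
    const std::vector<iovec>& get_vector() const { return vec_; }
    size_t get_length() const { return total_; }

private:
    std::vector<iovec> vec_;
    size_t total_ = 0;
};

// Inspection helper only: joins the pieces into one string.  This
// copies, which is exactly what the real send path would avoid.
std::string flatten(const IovecRenderer& r) {
    std::string out;
    for (const iovec& v : r.get_vector()) {
        out.append(static_cast<const char*>(v.iov_base), v.iov_len);
    }
    return out;
}
```

    To send the packet, one would point the msg_iov and msg_iovlen
    members of a struct msghdr at this vector and pass it to
    sendmsg(); that part is omitted here.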

  - NameCompressor
    Encapsulates states and algorithm of DNS name compression.  This
    class will be used with a renderer and a message.  In the initial
    prototype version, it's just empty (i.e., name compression isn't
    supported).  A straightforward implementation would be something
    like BIND9's dns_compress_t.  A performance-aware version should
    also be possible (although I don't have a specific idea).
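
    As an illustration of the straightforward approach, here is a
    minimal sketch of a compressor that remembers where each name
    suffix was rendered and emits a 2-byte pointer (RFC 1035, section
    4.1.4) when the suffix shows up again.  It is not modeled on
    dns_compress_t; the render() interface, the Labels type and the
    case-insensitive matching are all assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Labels;  // e.g. {"www", "example", "com"}

// Hypothetical compressor: maps each rendered name suffix to its offset.
class NameCompressor {
public:
    // Appends the name to buf, replacing the longest previously
    // rendered suffix with a compression pointer.
    void render(const Labels& name, std::vector<uint8_t>& buf) {
        for (size_t i = 0; i < name.size(); ++i) {
            const std::string key = suffix_key(name, i);
            std::map<std::string, uint16_t>::const_iterator it =
                offsets_.find(key);
            if (it != offsets_.end()) {
                // Pointer: top two bits set, then a 14-bit offset.
                buf.push_back(static_cast<uint8_t>(0xc0 | (it->second >> 8)));
                buf.push_back(static_cast<uint8_t>(it->second & 0xff));
                return;
            }
            if (buf.size() < 0x4000) {  // offset must fit in 14 bits
                offsets_[key] = static_cast<uint16_t>(buf.size());
            }
            assert(!name[i].empty() && name[i].size() <= 63);
            buf.push_back(static_cast<uint8_t>(name[i].size()));
            buf.insert(buf.end(), name[i].begin(), name[i].end());
        }
        buf.push_back(0);  // root label terminates an uncompressed name
    }

private:
    // Key for the suffix starting at label 'from': length-prefixed,
    // lower-cased labels, so that matching ignores case.
    static std::string suffix_key(const Labels& name, size_t from) {
        std::string key;
        for (size_t i = from; i < name.size(); ++i) {
            key.push_back(static_cast<char>(name[i].size()));
            for (char c : name[i]) {
                key.push_back(static_cast<char>(
                    std::tolower(static_cast<unsigned char>(c))));
            }
        }
        return key;
    }
    std::map<std::string, uint16_t> offsets_;
};
```

    A std::map keyed by strings is simple but allocates on every
    lookup, so a performance-aware version would need a different
    data structure.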

  - RRSetsSorter
    Encapsulates the algorithm of how to sort RRsets in a given
    section (for rendering).  I'm intending to use this as a
    generalized algorithm class for the "preferred-glue" option (A
    then AAAA or vice versa).  BIND9 hard-codes the logic of this
    option in the rendering function, but I thought we should separate
    the logic into a dedicated class for future extensibility and for
    making the main rendering code logic simpler.  See
    Message::to_wire() defined in dnsmessage.cc to understand how this
    class would be used.
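
    A minimal sketch of such a sorter follows.  It does not reproduce
    the code in dnsmessage.cc: the RRset stand-in, the sort()
    signature and the PreferredGlueSorter name are assumptions.  The
    preferred address type (A = 1 or AAAA = 28) is moved to the front
    of the section while the relative order of everything else is
    kept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the real RRset class.
struct RRset {
    std::string name;
    uint16_t rdtype;
};

const uint16_t TYPE_A = 1;
const uint16_t TYPE_AAAA = 28;

// Abstract sorting strategy applied to one section before rendering.
class RRSetsSorter {
public:
    virtual ~RRSetsSorter() {}
    virtual void sort(std::vector<RRset>& section) const = 0;
};

// Puts RRsets of the preferred address type first ("preferred-glue").
class PreferredGlueSorter : public RRSetsSorter {
public:
    explicit PreferredGlueSorter(uint16_t preferred) : preferred_(preferred) {
        assert(preferred == TYPE_A || preferred == TYPE_AAAA);
    }
    void sort(std::vector<RRset>& section) const override {
        const uint16_t preferred = preferred_;
        // stable_partition keeps the original order within each group.
        std::stable_partition(
            section.begin(), section.end(),
            [preferred](const RRset& r) { return r.rdtype == preferred; });
    }

private:
    uint16_t preferred_;
};
```

    The rendering code would then only see the RRSetsSorter
    interface, so a different ordering policy (e.g., something like
    rrset-order) could be plugged in as another subclass.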

  - MessageParser
    Encapsulates states and algorithm for parsing a wire-format
    message to construct a DNS Message class instance.  Like
    MessageRenderer, this is an abstract class, and a specific
    subclass will be used as an actual parser, depending on the user's
    usage.

    I've not given a complete example subclass in the initial
    prototype.  But a straightforward implementation would be, well,
    straightforward:-) it would parse the wire-format RRs one by one,
    construct an RRSet instance containing concrete Rdata instances
    for the RRs, and add it to the message.  A higher performance
    version would probably delay the actual instantiation and just
    maintain a pointer or something to represent the internal
    structure.
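
    The "delay the instantiation" idea might be sketched as below.
    This is a simplified view class rather than a MessageParser
    subclass, and every name in it is hypothetical.  It keeps a
    pointer to the caller's buffer and decodes the fixed 12-byte
    header fields only when they are asked for; the section contents
    are not parsed at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical "lazy" view over a wire-format message.  The buffer
// must outlive the view.
class LazyMessageView {
public:
    LazyMessageView(const uint8_t* wire, size_t len)
        : wire_(wire), len_(len) {
        if (wire == nullptr || len < HEADER_LEN) {
            throw std::invalid_argument("DNS message shorter than header");
        }
    }

    // Header fields are decoded from the buffer on each call.
    uint16_t get_qid() const { return read16(0); }
    bool get_qr() const { return (read16(2) & 0x8000) != 0; }
    unsigned get_rcode() const { return read16(2) & 0x0f; }
    uint16_t question_count() const { return read16(4); }
    uint16_t answer_count() const { return read16(6); }
    uint16_t authority_count() const { return read16(8); }
    uint16_t additional_count() const { return read16(10); }

    // The sections start right after the header; a fuller parser
    // would walk them on demand instead of building RRsets up front.
    const uint8_t* sections() const { return wire_ + HEADER_LEN; }
    size_t sections_length() const { return len_ - HEADER_LEN; }

private:
    static const size_t HEADER_LEN = 12;
    // Reads a 16-bit big-endian (network order) value.
    uint16_t read16(size_t off) const {
        assert(off + 2 <= len_);
        return static_cast<uint16_t>((wire_[off] << 8) | wire_[off + 1]);
    }
    const uint8_t* wire_;
    size_t len_;
};
```

    The trade-off is that errors in the section data are only found
    when (and if) the sections are actually accessed.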

Performance Considerations:

  Some performance related considerations are given above.  Here's one
  more point: I considered how we can optimize the rendering process
  when we use an optimized data source.  If we convert the source data
  to a generic RRset containing a list of generic Rdata, it will be
  pretty expensive and slow.  What I'm currently imagining is
  something like this:
  we have a specialized (sub)class of RRset.  It has the same
  interface as the generic RRset, which contains name/rdclass/rdtype
  and list of rdata in a straightforward way, but the specialized
  version implements the data more efficiently:

  class SpecialRRset {
  public:
      to_wire();  // same interface as the generic RRset
  private:
      char* name;        // already wire-format
      char* commondata;  // wire-format type, class, TTL
      char* rdata;       // list of rdata in wire-format
  };

  and, while the generic RRset's to_wire() would call the to_wire()
  method of its internal name, TTL, rdclass, rdtype instances and
  iterate over its internal list of Rdata (for its to_wire() method),
  the special version's to_wire() would directly copy (or even just
  make a reference to) its already wire-formatted data stored in the
  corresponding data source.
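
  A compilable sketch of the two variants is below.  The names
  AbstractRRset, GenericRRset, Buffer and Region are hypothetical,
  as is the to_wire(Buffer&) signature.  It assumes that each stored
  rdata region already includes its RDLENGTH prefix and that the
  common data holds type, class and TTL in wire order.  The special
  version here still copies into the output buffer; making only a
  reference would need a renderer that accepts pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<uint8_t> Buffer;

class AbstractRRset {
public:
    virtual ~AbstractRRset() {}
    virtual void to_wire(Buffer& out) const = 0;
};

// Generic version: fields are separate values, serialized one by one.
class GenericRRset : public AbstractRRset {
public:
    Buffer name;                // owner name (wire form, for brevity)
    uint16_t rdtype = 0;
    uint16_t rdclass = 0;
    uint32_t ttl = 0;
    std::vector<Buffer> rdata;  // one entry per RR, without RDLENGTH

    void to_wire(Buffer& out) const override {
        for (const Buffer& rd : rdata) {
            assert(rd.size() <= 0xffff);
            out.insert(out.end(), name.begin(), name.end());
            put16(out, rdtype);
            put16(out, rdclass);
            put16(out, ttl >> 16);
            put16(out, ttl & 0xffff);
            put16(out, static_cast<uint32_t>(rd.size()));
            out.insert(out.end(), rd.begin(), rd.end());
        }
    }

private:
    // Appends the low 16 bits of v in network byte order.
    static void put16(Buffer& out, uint32_t v) {
        out.push_back(static_cast<uint8_t>((v >> 8) & 0xff));
        out.push_back(static_cast<uint8_t>(v & 0xff));
    }
};

// A span of bytes owned by the data source.
struct Region {
    const uint8_t* data;
    size_t len;
};

// Special version: every piece is already in wire format, so
// rendering is a sequence of block copies with no per-field work.
class SpecialRRset : public AbstractRRset {
public:
    Region name{nullptr, 0};    // already wire-format
    Region common{nullptr, 0};  // wire-format type, class, TTL (8 bytes)
    std::vector<Region> rdata;  // each entry: RDLENGTH followed by RDATA

    void to_wire(Buffer& out) const override {
        for (const Region& rd : rdata) {
            append(out, name);
            append(out, common);
            append(out, rd);
        }
    }

private:
    static void append(Buffer& out, const Region& r) {
        out.insert(out.end(), r.data, r.data + r.len);
    }
};
```

  Both classes produce the same bytes for the same RRset, so the
  message code can call to_wire() without knowing which one it has.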

  This is actually similar to what recent versions of NSD do, if
  I understand the code correctly.  So, if we can do this well, we'd
  be able to achieve the same or a similar level of response
  performance as NSD while still providing more generic and intuitive
  APIs.

---
JINMEI, Tatuya