[bind10-dev] b10-auth query performance: experiments and suggestions

Thu Jan 19 00:23:08 UTC 2012

At Wed, 18 Jan 2012 15:14:42 -0800,
JINMEI Tatuya <jinmei at isc.org> wrote:

> I'll explain further optimizations in a separate message.

So here they are.  I didn't measure the effect of each specific
optimization, but with all of them combined (along with the easier
ones I mentioned in the previous message), the max qps for the root
server test scenario was 3.4 times that of the current version of
BIND 10, and about 2 times that of BIND 9.  It also required much
less memory than the current BIND 10 implementation (although it
still needed much more than BIND 9; I believe we can improve that too
without sacrificing response performance too much).

The experimental implementation is available at the jinmei-labelseq
branch of the public git repository.

The following are specific optimization ideas used in the branch:

1. Introduce a separate class to manipulate names
  I named it "LabelSequence".  In short, it's an efficient wrapper
  interface to a Name object, and allows the caller to get access to a
  specific sequence of labels of a name (such as "www.example" of
  "www.example.com.") without involving an expensive copy.  In query
  processing we often need to split a name, compare specific
  fragments of the labels, etc., so using this class instead of Name
  throughout performance-sensitive paths will generally improve
  performance.  The rest of the optimizations more or less rely on
  this.

2. Use LabelSequence in the red-black tree(-like) in-memory backend
  instead of Name::compare() or Name::split(), and store the keys of
  the trees in the form of native "wire" format for LabelSequence
  instead of bare Name objects.  This didn't seem to change lookup
  performance very much, but it should help reduce the memory
  footprint significantly.

3. Convert RDATA stored in the in-memory data source using the
  RdataFields class instead of holding it in the form of generic RRset
  objects.  This will also greatly reduce the memory footprint.

4. Update RdataFields so that the serialized name data will be in the
  native format for LabelSequence instead of wire-format name data,
  update MessageRenderer::writeName() to accept LabelSequence,
  and use RdataFields::toWire() to render RRsets stored in the
  in-memory data source.  Compared to #3 this would increase the
  memory footprint a bit (i.e. it is less memory-optimized), but
  should help improve response performance, probably quite
  substantially.
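
As a rough illustration of idea 1 (and of the serialized keys in idea
2), here is a minimal C++ sketch of a non-copying view over a name
stored as length-prefixed labels.  It is not the code in the
jinmei-labelseq branch: WireName, makeWireName() and LabelSeqView are
hypothetical names made up for this example, and the real
LabelSequence interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a name kept in uncompressed wire format:
// each label is a length octet followed by that many data octets, and
// the name ends with the zero-length root label.
struct WireName {
    std::vector<std::uint8_t> data;    // e.g. 3 'w' 'w' 'w' 7 'e' ... 0
    std::vector<std::size_t> offsets;  // start offset of each label
};

// Build a WireName from simple dotted text (no escapes handled).
WireName makeWireName(const std::string& text) {
    WireName n;
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t dot = text.find('.', start);
        if (dot == std::string::npos) dot = text.size();
        if (dot == start) break;       // empty label: stop parsing here
        n.offsets.push_back(n.data.size());
        n.data.push_back(static_cast<std::uint8_t>(dot - start));
        for (std::size_t i = start; i < dot; ++i) {
            n.data.push_back(static_cast<std::uint8_t>(text[i]));
        }
        start = dot + 1;
    }
    n.offsets.push_back(n.data.size());
    n.data.push_back(0);               // root label
    return n;
}

// A view on a contiguous range of labels of a WireName.  It stores only
// a pointer and two indices, so splitting never copies name data.  The
// WireName must outlive the view.
class LabelSeqView {
public:
    explicit LabelSeqView(const WireName& name)
        : name_(&name), first_(0), last_(name.offsets.size()) {}

    std::size_t labelCount() const { return last_ - first_; }

    // Drop n labels from the front or the back of the sequence.
    void stripLeft(std::size_t n) { assert(n < labelCount()); first_ += n; }
    void stripRight(std::size_t n) { assert(n < labelCount()); last_ -= n; }

    // Case-insensitive (ASCII) comparison directly on the stored bytes.
    // Length octets are at most 63, so lowering never changes them.
    bool equals(const LabelSeqView& other) const {
        if (labelCount() != other.labelCount()) return false;
        const std::size_t len = endOffset() - beginOffset();
        if (len != other.endOffset() - other.beginOffset()) return false;
        for (std::size_t i = 0; i < len; ++i) {
            if (lower(name_->data[beginOffset() + i]) !=
                lower(other.name_->data[other.beginOffset() + i])) {
                return false;
            }
        }
        return true;
    }

    // Textual form, for debugging only (this one does allocate).
    std::string toText() const {
        std::string out;
        for (std::size_t i = first_; i < last_; ++i) {
            const std::size_t off = name_->offsets[i];
            const std::size_t len = name_->data[off];
            if (len == 0) { out += "."; break; }  // root label
            if (!out.empty()) out += ".";
            out.append(reinterpret_cast<const char*>(&name_->data[off + 1]),
                       len);
        }
        return out;
    }

private:
    static std::uint8_t lower(std::uint8_t c) {
        return (c >= 'A' && c <= 'Z') ? static_cast<std::uint8_t>(c + 32) : c;
    }
    std::size_t beginOffset() const { return name_->offsets[first_]; }
    std::size_t endOffset() const {
        return last_ < name_->offsets.size() ? name_->offsets[last_]
                                             : name_->data.size();
    }

    const WireName* name_;
    std::size_t first_;  // index of the first label in the view
    std::size_t last_;   // one past the index of the last label
};
```

With this sketch, building a view on "www.example.com." and calling
stripRight(2) leaves a view whose text form is "www.example", and
comparing it against the same prefix of another name touches only the
stored bytes.  The same byte layout could serve as a tree key, which is
roughly the idea behind storing keys in serialized form rather than as
full Name objects.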

We'll need something like these anyway, if not for improving response
performance then for reducing memory footprint.  The current
implementation stores everything in the form of higher-level objects,
and is very inefficient in terms of memory usage.  This change would
also be a necessary first step toward a shared-memory type backend.

These are not super tricky, but would still require 2-3 sprints for
careful implementation and more detailed tests.  So I think this is
(a candidate for) post-y3 work as long as we can achieve
BIND 9-equivalent performance with the easier optimizations.

---
JINMEI, Tatuya