[bind10-dev] some ideas to improve query performance of DB-based datasrc

JINMEI Tatuya / 神明達哉 jinmei at isc.org
Mon Oct 1 08:41:23 UTC 2012


Not related to short term goals, but I've come up with a few ideas for
enhancing query response performance for DB-based datasrcs (which
currently means SQLite3, but some of them apply to other types of
backends).  I've actually played with the ideas over the last weekend,
and found them pretty effective.  In the simplest cases, the revised
version of the SQLite3 backend handles queries more than 87% faster
than the current implementation.

The basic observation (while obvious in some sense) is that the major
bottleneck is queries to the DB backend.  Converting the textual
results of DB queries to native libdns++ objects is also pretty
expensive.  So what we should do is minimize this processing per DNS
query.

One known way is to cache previous results and reuse them for
frequently asked queries.  While I don't deny its effectiveness (and,
depending on the required level of performance, there can be cases
where caching is the only feasible solution), caching has its own
drawbacks, such as making the system less predictable (due to the
overhead of managing the cache).  Its effect also heavily depends on
the query pattern.

My ideas try to eliminate or minimize such drawbacks, and consist of
the following independent optimizations:

1. Reduce the number of DB queries to identify the best match zone
   (see below)
2. Use numeric values (not text) to represent RR types (see the
   sketch after this list)
3. Delay the conversion from textual DB result to native C++ objects
   as much as possible
4. Enable "minimal-responses" by default: omit the authority and
   additional sections unless really necessary or explicitly
   configured
5. Cache some information at the origin node of zones and use it to
   save some DB queries.  Specifically, whether the zone is signed
   (and if so, with NSEC or NSEC3) and whether it has a DNAME RR at
   the apex.
6. Cache the zone's NS and their address records and use them if
   "minimal-responses" is disabled (see optimization #4) and these
   records are necessary for positive answers.
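
To make #2 concrete, here is a minimal sketch, assuming a
hypothetical numeric "typecode" column in place of the current
textual "rdtype" column (the table and column names here are
illustrative, not the actual schema).  Binding the 16-bit RR type
code as an integer avoids both a string comparison inside the DB and
a text-to-type conversion when building the response:

  // Sketch only: "records" and "typecode" are hypothetical names.
  #include <sqlite3.h>
  #include <stdint.h>
  #include <string>

  // Count records of the given name and numeric RR type (e.g. 1 for A,
  // 28 for AAAA) using an integer comparison instead of a text match.
  int countRecords(sqlite3* db, const std::string& name, uint16_t typecode) {
      sqlite3_stmt* stmt = NULL;
      const char* sql =
          "SELECT COUNT(*) FROM records WHERE name = ?1 AND typecode = ?2";
      if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK) {
          return -1;
      }
      sqlite3_bind_text(stmt, 1, name.c_str(), -1, SQLITE_TRANSIENT);
      sqlite3_bind_int(stmt, 2, typecode);
      int count = -1;
      if (sqlite3_step(stmt) == SQLITE_ROW) {
          count = sqlite3_column_int(stmt, 0);
      }
      sqlite3_finalize(stmt);
      return count;
  }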

Some more details about these ideas:

#1: Currently, we try to find the longest matching zone from the
"zones" table, starting from the original query name, then removing
the leftmost label one by one.  So, unless the query name is the zone
origin, we need at least two DB queries to identify the zone.  But I
found (unless I'm missing something) we could reduce it to just one,
regardless of the number of labels in the query name, similar to the
way we identify the "previous name" for DNSSEC.  If the "zones" table
has an "r(everse) name" column like the "records" table, we can
identify the best possible row by selecting with the condition
  "rname<='query rname' ORDER BY rname DESC LIMIT 1"
For example, if the zones table has example.com and example.net, and
the query name is www.example.com, we can correctly identify
example.com because net.example. > com.example.www. > com.example.

We need to be careful, though, not to accidentally "find" example.net
for the query name www.example.org (net.example. is the largest rname
that sorts before org.example.www.).  We can handle this with an
extra check on the caller side that the found zone name is really a
superdomain (including the equal case) of the query name.
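
Here is a rough sketch of the whole lookup, assuming a hypothetical
"rname" column is added to the "zones" table (labels reversed,
'.'-joined, with a trailing dot, mirroring the "records" table); the
table and column names are illustrative:

  #include <sqlite3.h>
  #include <algorithm>
  #include <sstream>
  #include <string>
  #include <vector>

  // Reverse the labels of a name: "www.example.com" -> "com.example.www."
  std::string toRname(const std::string& name) {
      std::vector<std::string> labels;
      std::stringstream ss(name);
      std::string label;
      while (std::getline(ss, label, '.')) {
          if (!label.empty()) {
              labels.push_back(label);
          }
      }
      std::reverse(labels.begin(), labels.end());
      std::string rname;
      for (std::vector<std::string>::size_type i = 0; i < labels.size(); ++i) {
          rname += labels[i] + ".";
      }
      return rname;
  }

  // Find the best-match zone for qname with a single SELECT.  Returns
  // the zone name, or an empty string if no enclosing zone is found
  // (the root zone is not handled in this sketch).
  std::string findZone(sqlite3* db, const std::string& qname) {
      const std::string qrname = toRname(qname);
      sqlite3_stmt* stmt = NULL;
      const char* sql = "SELECT name, rname FROM zones "
          "WHERE rname <= ?1 ORDER BY rname DESC LIMIT 1";
      if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK) {
          return "";
      }
      sqlite3_bind_text(stmt, 1, qrname.c_str(), -1, SQLITE_TRANSIENT);
      std::string zone_name, zone_rname;
      if (sqlite3_step(stmt) == SQLITE_ROW) {
          zone_name = reinterpret_cast<const char*>(sqlite3_column_text(stmt, 0));
          zone_rname = reinterpret_cast<const char*>(sqlite3_column_text(stmt, 1));
      }
      sqlite3_finalize(stmt);
      // The extra caller-side check: the found zone must really be a
      // superdomain of (or equal to) the query name.  With the
      // trailing-dot rname form this is a prefix test that can only
      // match at a label boundary.
      if (zone_rname.empty() ||
          qrname.compare(0, zone_rname.size(), zone_rname) != 0) {
          return "";
      }
      return zone_name;
  }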

By doing #4, we can save the DB queries for the origin and NS names
for "leaf zone" type queries.  Alternatively, if we do #6, we can
still return a complete response with the authority and additional
sections without extra DB queries.  Unlike cached general query
results, the zone's NS and their additional records can be used for
any query in the zone (and should be relatively stable), so the cache
effect will be higher.
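
As a rough illustration of #6, the cached data could be as simple as
the following (the names are mine, for illustration; I'm assuming
libdns++'s RRset pointer type here):

  // Sketch only: per-zone cache of what we need to fill in the
  // authority and additional sections of positive answers.
  #include <dns/rrset.h>                          // isc::dns::ConstRRsetPtr
  #include <vector>

  struct ZoneDelegationCache {
      isc::dns::ConstRRsetPtr ns_rrset;           // the zone's NS RRset
      std::vector<isc::dns::ConstRRsetPtr> glue;  // A/AAAA for the NS names
  };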

The main purpose of #5 is to skip the DB query at the origin name
whose only purpose is to check whether it has a DNAME.  Since in many
cases a zone doesn't have any DNAME at all, caching this fact should
be very effective; we might even allow the admin to specify it as a
configuration option.
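
Again as a sketch, the apex facts to be cached per zone would fit in
a trivial structure like this (names are illustrative):

  // Sketch only: facts about the zone apex, looked up once per zone
  // (or set by configuration) instead of once per DNS query.
  struct ZoneApexInfo {
      bool is_signed;     // whether the zone is DNSSEC-signed at all
      bool uses_nsec3;    // if signed: NSEC3 rather than NSEC
      bool has_dname;     // whether a DNAME RR exists at the apex
  };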

Combining all of these, we can reduce the number of DB queries to two
when the query name exists in the zone and is one label longer than
the origin.  I guess this is a common case for "domain hosting" type
services (where the server hosts a large number of small zones, and
each zone normally has very simple names like www + <origin>).

I've experimentally implemented some of these ideas and ran some
simple benchmarks.  I also implemented quick-hack MySQL and
PostgreSQL backend accessors and measured their performance.

The benchmark scenario is simple: set up a server with two zones, and
repeatedly send a query for the same single name in one of the zones.
The name is one label longer than the zone origin (e.g. zone name =
example.com, query name = www.example.com).

The results for the SQLite3 backend are:
- Current version: 3600 max qps
- Optimized (without DNAME skip): 6761 qps (87% faster than current)
- Optimized (with DNAME skip): 9093 qps (152% faster than current)

Optimized MySQL backend (without DNAME skip): around 1800 qps
Optimized PostgreSQL backend (without DNAME skip): around 1500 qps

If I enabled the query cache for MySQL, the result was much better,
but for a single-query test that's probably misleading because the
cache hit rate would be unrealistically good.

This was done on my laptop, so the numbers should be better on a
dedicated server machine with many cores.  That suggests it won't be
difficult to achieve several thousand max qps with a DB backend for
some kinds of deployment scenarios, even if we don't do any caching.

---
JINMEI, Tatuya

