[bind10-dev] scalability issue with SQL(ite3) select with 'like'

Mon Mar 5 09:18:46 UTC 2012

At Sat, 3 Mar 2012 13:59:25 +0100,
Michal 'vorner' Vaner <michal.vaner at nic.cz> wrote:

> > Is my understanding correct?  If so, is there any easy way (easy = by
> > not fundamentally changing the table organization) to get the result
> > we want to get more efficiently?  What we need to know is basically
> > that when we know there is no row whose name is
> > "nxdomain.example.com." whethere there is a row whose name is a pure
> > subdomain of it (e.g. "a.nxdomain.example.com.").
> 
> That is to find if the nxdomain.example.com is empty nonterminal,
> right?

Not only for that.  We use it to ensure that a wildcard match is
actually the best one:

Zone has *.example.com.
Query is a.b.example.com.

We need to make sure there's no name on or under b.example.com.  To do
so we use LIKE as  "rname LIKE com.example.b.%" (to be precise the
current SQLite3 data source implementation doesn't actually do it, and
the new version uses "name LIKE %.b.example.com."...but ignore these
subtle gaps for simplicity).

Anyway, I was actually wrong in saying with LIKE we cannot use
indices.  But the proposed change in ticket #324 still doesn't work
for several (one stupid, other more subtle) reasons:

- We specify "STRING" as the data type for "text-like" columns, but
  that's not a valid type name for SQLite3 (or SQL in general I
  suspect), and SQLite3 regard it as an "NUMERIC" affinity:
  http://www.sqlite.org/datatype3.html#affinity
  SQLite3 doesn't use indices for LIKE unless the corresponding column
  is of TEXT affinity:
  http://www.sqlite.org/optoverview.html
  (See Section 4.0 "The LIKE optimization")

  This is the stupid error.  We should have used TEXT instead of
  STRING from the beginning.

- But this trivial change still doesn't work, because we specify an
  expression for the right-hand side of LIKE:

const char* const q_count_str = "SELECT COUNT(*) FROM records "
    "WHERE zone_id=?1 AND rname LIKE (?2 || '%');";

  SQLite3 doesn't use indices for this query; it only uses indices
  "a string literal or a parameter bound to a string literal that does
  not begin with a wildcard character" (see again the above link).
  So, instead of concatenating the parameter and wildcard (%) in the
  template string, we should use a single parameter

const char* const q_count_str = "SELECT COUNT(*) FROM records "
    "WHERE zone_id=?1 AND rname LIKE ?2;";

  and have the caller append '%' to the string passed to
  sqlite3_bind_text().

  This is the subtle one.

But queries for a non existent name was still slow in my experiment,
and I figured out that bottleneck too.  It was this query:

const char* const q_previous_str = "SELECT name FROM records "
    "WHERE zone_id=?1 AND rdtype = 'NSEC' AND "
    "rname < $2 ORDER BY rname DESC LIMIT 1";

Because (I suspect) the result of "rname < $2" can be big and there's
no index on "rdtype", so SQLite3 needs to perform an expensive scan
for the large search space.  (If the comparison for name or rname is
an equality check it's reasonably fast - I guess this is because
SQLite3 uses the expensive scan only for the index search result, and
in the equality case the scan range is quite limited in practice).

This can be solved by adding this index:
CREATE INDEX records_bytype_and_rname ON records (rdtype, rname)
Note that the order of rdtype and rname is important.  Since the
comparison for rname is an inequality check, the corresponding column
in the index must be placed last (See section 1.0 of
http://www.sqlite.org/optoverview.html).  Also, simply adding an index
on "rdtype" doesn't really work well, because even though the indices
are (seemingly) used for searches themselves, SQLite3 creates a
temporary B-tree for the "ORDER BY" part.  So, while it's fast if the
zone is unsigned (because the tree would be empty), it will be quite
slow for a signed zone, for which this query is actually needed.

I've made these updates to the trac324new branch and pushed it to the
public repo (I've already started another task, so I'll release the
ownership for now).  If someone wants to complete this task, please
do.

Some additional notes:

- This query in sqlite3_accessor.cc (the new implementation) doesn't
  seem to be good for the same reason(s):
    "SELECT rdtype, ttl, sigtype, rdata " // ANY_SUB
        "FROM records WHERE zone_id=?1 AND name LIKE (\"%.\" || ?2)",
  it uses an expression, and it places the wildcard first ('%').
  We'll have to fix that, too.
- The bottleneck for "find previous" could be avoided for unsigned
  zones.  We should probably first identify whether we need to know
  the previous name before (i.e., whether the zone is NSEC signed)
  actually trying to find it.
- I also noticed this query is slow:
    "SELECT rdtype, ttl, sigtype, rdata, name FROM records " // ITERATE
        "WHERE zone_id = ?1 ORDER BY rname, rdtype",
  Because it builds a temporary B-tree for "ORDER BY".  Like the
  "previous" case, we could solve this by adding an index on (rname,
  rdtype) in this order.  But in this case I wonder we could rather
  omit the "order by rdtype" part.  Adding another compound index
  makes the DB file even bigger and would make update slower.  Unless
  we need this ordering by the API contract it's probably better to be
  less strict.

I plan to make tickets for (some of) these points.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.