[bind10-dev] scalability issue with SQL(ite3) select with 'like'
Shane Kerr
shane at isc.org
Tue Mar 13 14:42:01 UTC 2012
Jinmei,
Sorry for the late reply. I'd been meaning to give a detailed look at
this since you posted it, but haven't carved out the time yet.
On Monday, 2012-03-05 01:18:46 -0800,
JINMEI Tatuya / 神明達哉 <jinmei at isc.org> wrote:
> - We specify "STRING" as the data type for "text-like" columns, but
> that's not a valid type name for SQLite3 (or SQL in general I
> suspect), and SQLite3 regard it as an "NUMERIC" affinity:
> http://www.sqlite.org/datatype3.html#affinity
> SQLite3 doesn't use indices for LIKE unless the corresponding column
> is of TEXT affinity:
> http://www.sqlite.org/optoverview.html
> (See Section 4.0 "The LIKE optimization")
>
> This is the stupid error. We should have used TEXT instead of
> STRING from the beginning.
Actually this turns out not to be an error at all. SQLite treats STRING
declarations like TEXT declarations:
sqlite> create table foo (bar string);
sqlite> insert into foo values ("baz");
sqlite> select * from foo;
baz
sqlite> select typeof(bar) from foo;
text
> - But this trivial change still doesn't work, because we specify an
> expression for the right-hand side of LIKE:
>
> const char* const q_count_str = "SELECT COUNT(*) FROM records "
> "WHERE zone_id=?1 AND rname LIKE (?2 || '%');";
>
> SQLite3 doesn't use indices for this query; it only uses indices
> "a string literal or a parameter bound to a string literal that does
> not begin with a wildcard character" (see again the above link).
> So, instead of concatenating the parameter and wildcard (%) in the
> template string, we should use a single parameter
>
> const char* const q_count_str = "SELECT COUNT(*) FROM records "
> "WHERE zone_id=?1 AND rname LIKE ?2;";
>
> and have the caller append '%' to the string passed to
> sqlite3_bind_text().
>
> This is the subtle one.
I'm not 100% sure, but I think this is also not an issue. If I compare
the opcodes that SQLite produces I don't see a problem here:
sqlite> create index bar_idx on foo (bar);
sqlite> explain select * from foo where bar like 'xxx%';
0|Trace|0|0|0||00|
1|Goto|0|13|0||00|
2|OpenRead|0|2|0|1|00|
3|Rewind|0|11|0||00|
4|String8|0|2|0|xxx%|00|
5|Column|0|0|3||00|
6|Function|1|2|1|like(2)|02|
7|IfNot|1|10|1||00|
8|Column|0|0|4||00|
9|ResultRow|4|1|0||00|
10|Next|0|4|0||01|
11|Close|0|0|0||00|
12|Halt|0|0|0||00|
13|Transaction|0|0|0||00|
14|VerifyCookie|0|6|0||00|
15|TableLock|0|2|0|foo|00|
16|Goto|0|2|0||00|
sqlite> explain select * from foo where bar like 'xxx'||'%';
0|Trace|0|0|0||00|
1|String8|0|2|0|xxx|00|
2|String8|0|3|0|%|00|
3|Concat|3|2|1||00|
4|Goto|0|16|0||00|
5|OpenRead|0|2|0|1|00|
6|Rewind|0|14|0||00|
7|Copy|1|4|0||00|
8|Column|0|0|5||00|
9|Function|1|4|3|like(2)|02|
10|IfNot|3|13|1||00|
11|Column|0|0|6||00|
12|ResultRow|6|1|0||00|
13|Next|0|7|0||01|
14|Close|0|0|0||00|
15|Halt|0|0|0||00|
16|Transaction|0|0|0||00|
17|VerifyCookie|0|6|0||00|
18|TableLock|0|2|0|foo|00|
19|Goto|0|5|0||00|
It is possible that the parameter code within the SQLite C library
breaks this, but it seems like a strange anti-optimization to me. I
haven't timed things, but my theory is that this is a non-issue.
> But queries for a non existent name was still slow in my experiment,
> and I figured out that bottleneck too. It was this query:
>
> const char* const q_previous_str = "SELECT name FROM records "
> "WHERE zone_id=?1 AND rdtype = 'NSEC' AND "
> "rname < $2 ORDER BY rname DESC LIMIT 1";
>
> Because (I suspect) the result of "rname < $2" can be big and there's
> no index on "rdtype", so SQLite3 needs to perform an expensive scan
> for the large search space. (If the comparison for name or rname is
> an equality check it's reasonably fast - I guess this is because
> SQLite3 uses the expensive scan only for the index search result, and
> in the equality case the scan range is quite limited in practice).
>
> This can be solved by adding this index:
> CREATE INDEX records_bytype_and_rname ON records (rdtype, rname)
> Note that the order of rdtype and rname is important. Since the
> comparison for rname is an inequality check, the corresponding column
> in the index must be placed last (See section 1.0 of
> http://www.sqlite.org/optoverview.html). Also, simply adding an index
> on "rdtype" doesn't really work well, because even though the indices
> are (seemingly) used for searches themselves, SQLite3 creates a
> temporary B-tree for the "ORDER BY" part. So, while it's fast if the
> zone is unsigned (because the tree would be empty), it will be quite
> slow for a signed zone, for which this query is actually needed.
In the old trac324 branch the following indexes were created:
CREATE INDEX records_byname
ON records (zone_id, name, rdtype, sigtype, id, ttl, rdata);
CREATE INDEX records_byrname
ON records (zone_id, rname, rdtype, name);
CREATE INDEX nsec3_byhash
ON nsec3 (zone_id, hash, rdtype, id, ttl, rdata)
These address this problem, although in a very slightly different
manner. That's why I pointed them out in the comment on the ticket.
> I've made these updates to the trac324new branch and pushed it to the
> public repo (I've already started another task, so I'll release the
> ownership for now). If someone wants to complete this task, please
> do.
>
> Some additional notes:
>
> - This query in sqlite3_accessor.cc (the new implementation) doesn't
> seem to be good for the same reason(s):
> "SELECT rdtype, ttl, sigtype, rdata " // ANY_SUB
> "FROM records WHERE zone_id=?1 AND name LIKE (\"%.\" || ?2)",
> it uses an expression, and it places the wildcard first ('%').
> We'll have to fix that, too.
This is a more serious issue, and the reason we have the RNAME field.
It is possible in more sophisticated SQL like PostgreSQL (or presumably
Oracle) to index on a reverse string so a prefixed LIKE will be
indexed, but I doubt that this is true on SQLite, and we certainly
don't do it.
> - The bottleneck for "find previous" could be avoided for unsigned
> zones. We should probably first identify whether we need to know
> the previous name before (i.e., whether the zone is NSEC signed)
> actually trying to find it.
Certainly not doing a lookup at all seems like a reasonable
optimization. :)
> - I also noticed this query is slow:
> "SELECT rdtype, ttl, sigtype, rdata, name FROM records " //
> ITERATE "WHERE zone_id = ?1 ORDER BY rname, rdtype",
> Because it builds a temporary B-tree for "ORDER BY". Like the
> "previous" case, we could solve this by adding an index on (rname,
> rdtype) in this order. But in this case I wonder we could rather
> omit the "order by rdtype" part. Adding another compound index
> makes the DB file even bigger and would make update slower. Unless
> we need this ordering by the API contract it's probably better to be
> less strict.
The indexes from the original trac324 should address this issue.
> I plan to make tickets for (some of) these points.
I don't recall seeing these... did you create these tickets? Certainly
fixing the LIKE '%'||?2 issue and avoiding any lookups at all if the
zone are unsigned seem useful.
Anyway, we'll discuss this on our call today.
Cheers,
--
Shane
More information about the bind10-dev
mailing list