Ternary search trees added to TODO
Russ Allbery
rra at stanford.edu
Mon Apr 23 12:26:32 UTC 2001
Forrest J Cavalier <mibsoft at epix.net> writes:
> I wasn't part of the discussion on the gcc mailing list. But the
> advantages you list are a little too optimistic.
> Data structure may be simpler, algorithms are not. Performance is
> worse, unless your hash table is too small with a bucket size of 1.
That's not clear; I'd like to at least try it. A lookup in a hash table
is linear in the length of the key, so it's O(k) vs. O(lg n + k), which
makes the performance difference a little harder to judge. Plus
unsuccessful lookup is frequently much faster, and the operation at each
character of the key can be quite a bit faster; it depends on whether
pointer chasing is faster than the mathematical operations the hash
function is performing.
> In a ternary search tree, the scalar comparison is much more costly than
> a tight strcmp loop.
But a hell of a lot faster than the hash function.
> Sizing and resizing will still be an issue. The order of entries in the
> active file must be preserved so that it can be MMAPed, and the active
> file is not necessarily sorted. So the "sorting" that you get from a
> ternary search tree isn't actually useful.
The only thing we can really store in the search tree is a pointer to the
appropriate line in the memory-mapped copy of the active file; I don't see
any way of fundamentally changing the structure of the active file without
a lot more pain.
> Again, I don't object to that note in TODO, but it sure isn't clear to
> me that ternary search trees are "the" solution to use.
I'll clarify. Thanks!
--
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>
More information about the inn-workers
mailing list