Ternary search trees added to TODO

Mon Apr 23 11:57:39 UTC 2001

> Added to TODO, idea from some discussion on the gcc mailing list:
> 
> * INN currently uses hash tables to store the active file internally.  It
>   should probably use ternary search trees instead; the data structure is
>   simpler, performance is comparable or better for hits and significantly
>   better for misses, sizing and resizing becomes a non-issue, and the
>   space penalty isn't too bad.  For more information, see
>   <http://www.ddj.com/articles/1998/9804/9804a/9804a.htm>.  A generic
> 

I wasn't part of the discussion on the gcc mailing list.
But the advantages you list are a little too optimistic.

Data structure may be simpler, algorithms are not.  Performance
is worse, unless your hash table is too small with a bucket size
of 1.  And sizing and resizing matter in INN for other reasons.

I don't object to using ternary search trees for item
lookup.  They have advantages.  The hash table used now
could be better (handling collisions is where dbz goes
wrong too.  Any hash table with a bucket size of 1 is
going to be very poor above 70% full.)

I agree INN would benefit from serious work here.  There are lots
of performance problems with closing and opening channels for
each newgroup/rmgroup.  The LIST ACTIVE result has to match what
is in the spool and returned by XOVER and friends, (and since INN
1.x there is a mismatch when there is an overview channel.)

---------------------------------------------------------
Let's remember that all O(1) operations are not equal,
and lookups are O(lg n) in a ternary search tree.