Natterings about history files

Russ Allbery rra at stanford.edu
Sat Mar 3 11:53:23 UTC 2001


Forrest J Cavalier <mibsoft at epix.net> writes:

> If you want to have multiple history hash tables instead, the big
> penalty is paid by nnrpd when doing lookups by message ID.

> Some people have article archives going back years. INN need not support
> a system that DejaNews can use, but a lot of sites have comp.* groups
> archived back pretty far. We don't want an nnrpd hitting 300 history
> files before it can say "oh, sorry.  Don't have that one!"

I'm not convinced that's a big penalty.  You search the newest history
files first, since most requests are satisfied out of those, and you're
generally going to combine older, sparse history files to make them less
sparse.  You may be right, but I think it's going to take a very weird
setup to end up with more than 20-odd history files.  Searching 20 small
history files in the unusual case, and more often only 1-5, isn't that
bad.
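
A rough sketch of that newest-first search, just to make the cost concrete.
Everything here is hypothetical: each dict stands in for one history file's
index, the function name lookup is made up, and this is not how INN or dbz
is actually implemented.

```python
from typing import Dict, List, Optional, Tuple


def lookup(message_id: str,
           history_files: List[Dict[str, str]]) -> Tuple[Optional[str], int]:
    """Search history files newest first.

    history_files is ordered newest to oldest; each dict is a stand-in
    for one history file's index (message ID -> storage token).
    Returns (token or None, number of files probed).
    """
    probes = 0
    for index in history_files:
        probes += 1
        token = index.get(message_id)
        if token is not None:
            # Common case: found in one of the first few files.
            return token, probes
    # Worst case: every file was probed before giving up.
    return None, probes
```

A hit on a recent article costs one probe; only a miss (or a very old
article) pays for every file in the list.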

> Or, you have per-day history files up to /remember/ and then bigger N
> article history files for the stuff you have archived.

Now that you mention it, that's probably the best way to deal with it.
Once a history file is older than /remember/, pull out all the entries
that are still live, dump them into the large history file, or several of
those if you have a lot of old articles, and delete the rest.  Then you
know that that final history file contains only live articles and you can
purge entries as those articles expire.
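
Roughly, the consolidation step could look like the sketch below.  The
names consolidate, purge_expired, and is_live are invented for
illustration, dicts stand in for history files, and a single archive file
is assumed rather than several.

```python
from typing import Callable, Dict


def consolidate(old_history: Dict[str, str],
                archive: Dict[str, str],
                is_live: Callable[[str], bool]) -> None:
    """Fold a history file older than /remember/ into the archive.

    Entries whose article is still on disk move to the archive; the
    rest are dropped along with the old file.
    """
    for message_id, token in old_history.items():
        if is_live(token):
            archive[message_id] = token
    old_history.clear()  # stands in for deleting the old history file


def purge_expired(archive: Dict[str, str],
                  is_live: Callable[[str], bool]) -> None:
    """The archive holds only live articles, so drop entries as they expire."""
    dead = [mid for mid, token in archive.items() if not is_live(token)]
    for mid in dead:
        del archive[mid]
```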

> At the time I was thinking about it, I didn't like any of the tradeoffs.
> It seemed less robust, and harder to tune.  But I think you could
> probably design something acceptable that allowed you to keep dbz.

> (Personally, I think dbz is part of the problem, so I was planning to
> replace it altogether.)

Oh, I agree, but I think that per-file indexes have some nice properties.
For example, one can memory-map only the more recent indexes and not
bother with the older ones, which will be getting far fewer lookups.
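
As a minimal sketch of that idea, assuming one index file per history
file: map the newest few and fall back to ordinary seek/read for the
rest.  IndexReader and open_indexes are hypothetical names, and the index
contents are treated as opaque bytes rather than any real dbz format.

```python
import mmap
from typing import List


class IndexReader:
    """Hypothetical reader for one per-file index."""

    def __init__(self, path: str, use_mmap: bool) -> None:
        self._file = open(path, "rb")
        self.mapped = use_mmap
        # Note: mmap refuses to map an empty file, so this assumes the
        # index is non-empty.
        self._map = (
            mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
            if use_mmap else None
        )

    def read_at(self, offset: int, length: int) -> bytes:
        if self._map is not None:
            return self._map[offset:offset + length]  # served from the mapping
        self._file.seek(offset)                        # older index: plain I/O
        return self._file.read(length)

    def close(self) -> None:
        if self._map is not None:
            self._map.close()
        self._file.close()


def open_indexes(paths_newest_first: List[str],
                 mmap_count: int) -> List[IndexReader]:
    """Memory-map only the newest mmap_count indexes."""
    return [IndexReader(path, i < mmap_count)
            for i, path in enumerate(paths_newest_first)]
```

Both kinds of reader answer the same read_at call, so the lookup code
doesn't need to care which indexes are mapped.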

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>

