Natterings about history files

Thu Mar 1 17:30:57 UTC 2001

Russ Allbery <rra at stanford.edu> writes:

> bill davidsen <davidsen at tmr.com> writes:
> 
> >   Perhaps this could be delayed until some time after the article (or
> > cancel for the article) is received, to avoid lookup delays. But after a
> > day or so, it's unlikely that the info will be needed often enough to
> > make the performance hit offset the gain in instant expire.
> 
> I went down that road too in my head, and then realized that the problem
> is that if you're accepting a new article that you've never seen before,
> you have to do a lookup in *all* your history files, or otherwise you may
> as well not keep them.  So it's not true that older history files get
> fewer lookups, unfortunately; new articles are the common case of stuff
> that makes it through the cache.  :/
> 

One option might be to shortcut the start of the expire, since the
history will normally be sorted by arrival time. When running the
expire, note down where tomorrow's expire will start checking entries
(i.e. skip over the ones we'll just delete), then tomorrow, start from
that point in the history file rather than at the beginning.

It loses some robustness, though, and I'm not convinced the overhead of
reading through those lines is large enough to make it a worthwhile
optimisation.
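
To make the idea concrete, here is a rough sketch in Python. It is not
INN code, and it assumes a simplified, hypothetical history format: one
"message-id<TAB>arrival-epoch" entry per line, sorted by arrival time,
in a file that is only appended to between runs. The function name and
the stale-offset fallback are my own invention for illustration.

```python
def find_expire_boundary(history_path, cutoff, start_offset=0):
    """Return the byte offset of the first entry that arrived at or after
    `cutoff`, scanning forward from `start_offset`.

    Assumes (hypothetically) lines of the form b"<msgid>\\t<epoch>\\n",
    sorted by arrival time. Everything before the returned offset is
    expired; the offset is also the hint to save for the next run.
    """
    with open(history_path, "rb") as f:
        # Robustness guard: if the saved hint points past the end of the
        # file (e.g. the history was rebuilt), fall back to a full scan.
        f.seek(0, 2)
        if start_offset > f.tell():
            start_offset = 0
        f.seek(start_offset)
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:
                return pos  # reached end of file: every entry is expired
            arrived = int(line.rstrip(b"\n").split(b"\t")[1])
            if arrived >= cutoff:
                return pos  # first entry that survives this run
```

Each run would pass in the offset returned by the previous run as
start_offset and store the new return value. The guard only catches a
hint that lies beyond the end of the file; a hint that is stale in some
other way (say the file was rewritten or is not strictly sorted) goes
unnoticed, which is the robustness loss mentioned above.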

-- 
Alex Kiernan, Principal Engineer, Development, Thus PLC