Natterings about history files

Forrest J. Cavalier III mibsoft at epix.net
Fri Mar 2 05:57:20 UTC 2001


> come of it. I don't think keys need be deleted, the data entry could
> just be marked gone-walkabout until the file is cleaned.
> 

Even though there were multiple history files, my on-paper design
had one hash table, where each key lookup gave the address
of the datum (file number and offset).
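
To make the single-table idea concrete, here is a minimal sketch
in Python (for brevity; it is not INN code). The class name, the
method names, and the use of a truncated MD5 digest as the key are
all assumptions made for illustration. The only part taken from
the design above is that one table maps each key to a
(file number, offset) pair.

```python
import hashlib


class SingleIndex:
    """One table covering every history file (illustrative only)."""

    def __init__(self):
        # key digest -> (history file number, byte offset in that file)
        self.table = {}

    @staticmethod
    def key(message_id):
        # Assumption: a truncated MD5 digest stands in for whatever
        # hash the real design would use.
        return hashlib.md5(message_id.encode("utf-8")).digest()[:8]

    def add(self, message_id, file_no, offset):
        self.table[self.key(message_id)] = (file_no, offset)

    def locate(self, message_id):
        # A single probe answers both "do we have it?" and "where?".
        # None means the message ID is not in any history file.
        return self.table.get(self.key(message_id))


idx = SingleIndex()
idx.add("<a@example.com>", 3, 1024)
```

With this layout the number of history files does not affect the
cost of a lookup, hit or miss.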

If you want to have multiple history hash tables instead,
the big penalty is paid by nnrpd when doing lookups by message ID.

Some people have article archives going back years. INN need not
support a system that DejaNews can use, but a lot of sites have
comp.* groups archived back pretty far. We don't want an nnrpd
hitting 300 history files before it can say "oh, sorry.  Don't
have that one!"
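
A rough sketch of why the miss is the expensive case, assuming one
table per day searched newest first. Plain Python dicts stand in
for the per-day hash tables, and the function and variable names
are hypothetical.

```python
def lookup(tables, message_id):
    """Search the tables in order; return (location, tables probed)."""
    probes = 0
    for table in tables:
        probes += 1
        location = table.get(message_id)
        if location is not None:
            return location, probes
    # A miss is only known after every table has been probed.
    return None, probes


# 300 per-day tables, newest first; only the oldest holds the article.
tables = [{} for _ in range(300)]
tables[-1]["<old@example.com>"] = (299, 0)

hit = lookup(tables, "<old@example.com>")
miss = lookup(tables, "<nope@example.com>")
```

Here both the old-article hit and the miss probe all 300 tables.
Each probe is just a dict lookup in this sketch; in a real system
each one could be a disk access.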

So this is where the art of tradeoffs happens if you
want to go with multiple hash tables.

Instead of one history file per day, maybe do it as N
articles per history file.  (Then it gets a little bit harder
to determine the date bucket you need to look in.)
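
One way to picture the extra work, as a sketch only: with N
articles per file, the file number follows from an arrival
sequence number, but going from a date to a file needs a side
index of when each file started. The sequence number, the
first_arrival list, and both function names are assumptions for
illustration.

```python
import bisect

N = 4  # articles per history file (tiny, for illustration)


def file_for_sequence(seq):
    # The k-th article to arrive (counting from 0) lands in file k // N.
    return seq // N


def file_for_time(first_arrival, when):
    """Find the file whose time range covers 'when'.

    first_arrival[i] is the arrival time of the first article in
    file i, in ascending order.  Returns None if 'when' predates
    the oldest file.
    """
    i = bisect.bisect_right(first_arrival, when) - 1
    return i if i >= 0 else None


# Files fill at uneven rates, so the boundaries are not evenly spaced.
first_arrival = [100, 250, 260, 900]
```

With per-day files the date is the bucket; here the boundaries
depend on traffic, so they have to be recorded and searched.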

Or, you have per-day history files up to /remember/ and then
bigger N-article history files for the stuff you have archived.

At the time I was thinking about it, I didn't like any of the
tradeoffs.  It seemed less robust, and harder to tune.  But
I think you could probably design something acceptable
that allowed you to keep dbz.

(Personally, I think dbz is part of the problem, so I was
planning to replace it altogether.)

1. INND must be able to do fast lookups for most incoming
  article decisions, without requiring infinite memory.

2. NNRPD must be able to do a message-ID access
  (to success or failure) in a reasonable number of I/Os.

3. INND must be able to cancel articles in a reasonable
  number of I/Os.

4. EXPIRE must be able to deal with all the different
  storage methods.  You have to be able to expire.  (This
  may require the ability to enumerate articles.)  Continuous
  expire is preferable to "stop everything and expire."
  Must not require infinite time or memory.

5. makehistory must be able to build the history file from
  the spool.  And there must be a makedbz equivalent to rebuild
  the hash tables from history.
  (Again, without requiring infinite time or memory.)





More information about the inn-workers mailing list