Assorted history questions
rra at stanford.edu
Sat Mar 3 11:16:40 UTC 2001
Alex Kiernan <alexk at demon.net> writes:
> grephistory can't support -h or -t (offset & hash aren't exposed by the
> API), also you can't search by hash any more; these don't look like very
> serious restrictions.
Agreed. I think both of those can be dropped.
> prunehistory looks like it can be done with HISreplace (which I haven't
> implemented yet), but does anyone actually use prunehistory (actually I
> guess the answer is, if it's there someone will have found a use for it
> in the past).
Hm. prunehistory would be useful for something like the thdexpire program
in contrib, at least in theory, by updating the history file at the same
time as expiring the articles. But that wouldn't work while INN was
running. I can't see a lot of utility, and it's easily reimplementable
with the new API.
> expire can't support -d (specify new directory),
We definitely need to support this functionality; some history files will
need plenty of temporary space and I need to use this option now on one of
> -f (specify new history base name),
We probably need some way of generating a separate history file not in the
default location, but -d may be sufficient.
> -i (ignore old database)
I don't really understand why this would be needed or useful... if you're
ignoring the old database, why not just create a new empty one first?
> or -x (don't write);
This looks useful for some purposes, but only as a nice additional
feature, not part of the basic functionality. It can be easily skipped
for the first pass.
> makehistory when running without -x creates the history and the
> associated dbz files; it could do with some extra dbz specific flags in
> the HISopen though (like telling it to create the dbz in core).
I wonder if this needs to be a dbz-specific option or if there's some way
of generalizing it. Maybe a few levels of "map index files"? Hm.
> nnrpd's checking of the current history file now appears in HISlookup,
> but every application gets this (right now it just stats the history
> file at no less than 30s intervals, that bit needs to be configurable),
> but should there be some way of defeating this for innd (and expire,
> grephistory etc.), should it be explicitly enabled instead?
Yes, this should be a configurable option.
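
As a rough sketch of what a configurable check might look like (the struct
and function names below are hypothetical and not part of the actual HIS
API), the caller could carry an enable flag and an interval, and the lookup
path would only stat the history file when both allow it:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical settings for the "has the history file changed?" check.
 * None of these names come from the real HIS API. */
struct stat_policy {
    bool   enabled;     /* false for callers such as innd or expire */
    time_t interval;    /* minimum seconds between stat() calls */
    time_t last_check;  /* time of the most recent check */
};

/* Return true if the caller should stat the history file now.  Updates
 * last_check when it does, so checks are at least `interval' apart. */
static bool
should_stat(struct stat_policy *sp, time_t now)
{
    if (!sp->enabled)
        return false;
    if (now - sp->last_check < sp->interval)
        return false;
    sp->last_check = now;
    return true;
}
```

With this shape, nnrpd would set enabled with an interval of 30, while
innd, expire, and grephistory would leave it disabled.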
> How was `threshold' to HISexpire expected to work - I'm assuming upper
> bound on the arrival time for articles passed to the callback; if you
> tell expire to do decisions based on posting time you need to pass in
> "infinite" (which I guess probably means 0 as a special case).
Hm. The current history format has the posting time as well as the
arrival time. My initial thought was that you'd give HISexpire a
threshold and articles newer than that never even get passed to the
callback, but that's probably premature optimization. The callback can
just as easily check the time and immediately return. I don't think
threshold is actually useful.
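
To illustrate the callback doing that check itself, here is a minimal
sketch. The policy struct and the function signature are assumptions made
for the example, not the real HISexpire callback interface:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical expire policy held by the caller and handed to the
 * callback; not part of the real HIS API. */
struct expire_policy {
    time_t cutoff;      /* entries newer than this are left alone */
    bool   use_posted;  /* judge by posting time instead of arrival time */
};

/* Return true if the entry is newer than the cutoff, in which case the
 * callback can return immediately without doing any further work. */
static bool
entry_is_too_new(const struct expire_policy *policy,
                 time_t arrived, time_t posted)
{
    time_t when = policy->use_posted ? posted : arrived;

    return when > policy->cutoff;
}
```

Since the callback picks which timestamp to compare, the posting-time case
needs no special "infinite" threshold value.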
> Need to think of a design for replacing initial database creation (or
> should it just do like ovdb & create the files it needs, sizing a
> problem then though I guess).
I'd really like an explicit creation step, even if optional, so that
there's some place that someone can select initial options. Size is just
one of the things that I can think of; some history formats may want to
know the retention time of articles at the time of creation, for example.
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>