pure transit server
Russ Allbery
rra at stanford.edu
Thu Feb 8 00:17:27 UTC 2001
Fabien Tassin <fta at sofaraway.org> writes:
> I try to keep filtering possible. I will probably not write it myself
> but I'll add a hook for it. Probably using an API. If you have some
> precise requirements, please, speak now.
I should work on fleshing out that filtering API that I had floating
around in my head, then, shouldn't I? Hm.
I think all one needs from the server standpoint is a hook to pass an
article to a filter that returns either "okay" or some rejection message.
We can elaborate on that later, but that should work for starters.
Ideally, the filter should also be able to write to history (so that one
can pre-reject the cancel for spam that has already been filtered).
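Concretely, the hook could be as simple as this (a strawman sketch; none
of these names exist anywhere yet, and the article structure is whatever
your server ends up using internally):

    #include <stdbool.h>
    #include <time.h>

    struct article;                     /* opaque: the server's own article type */

    struct filter_callbacks {
        /* Supplied by the server so the filter can write history entries,
           e.g. to pre-reject the cancel for spam it has already dropped. */
        bool (*history_add)(const char *msgid, time_t arrived);
    };

    /* Implemented by the filter.  Return NULL for "okay", or a short
       rejection message that the server hands back to the peer. */
    const char *filter_article(const struct article *art,
                               const struct filter_callbacks *cb);

Keeping the callbacks in a struct means more hooks can be added later
without breaking filters that are already written.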
The message ID filter will be harder and can be put off for the first
pass, since it's not nearly as widely used or as widely useful.
> I've started a history API hiding everything from INNT. WIP and cache are
> not visible, but I want more.
Ideally, we'll want to combine that with Alex's down the road, if we can
make the API clean enough to be useful in either architecture.
> I'm thinking of a way to avoid expire. A kind of thread dedicated to
> self-expiring history based upon its size and upon hierarchy local
> prefs. Avoiding make(history|dbz) would be nice too but it doesn't fit
> into my current model.. still searching.
For a pure transit server, all you need to store in history is the hash of
the message ID and the time. All expiry of history for a transit server
should be based on the equivalent of /remember/. There's no need that I
can think of to store the article Date time, the expire time, or any
storage tokens; you don't care about any of that stuff for the kind of
server you're designing. (That means that HEAD and STAT won't work, but I
think that's a reasonable price to pay for the conceptual simplicity...
more annoyingly, though, debugging will be harder, although that could
possibly be fixed by writing the storage location to the trace log file.)
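To make that concrete, the whole per-article record could be as small as
this (just a sketch; the 16-byte field is the full MD5 hash I argue for
below):

    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    struct transit_history_entry {
        unsigned char hash[16];   /* full MD5 of the Message-ID */
        uint32_t      arrived;    /* arrival time, seconds since the epoch */
    };                            /* 20 bytes: no Date, no expire, no token */

    /* Expiry is just the /remember/ interval (in days, as in expire.ctl):
       anything older than that can be thrown away. */
    static bool
    entry_expired(const struct transit_history_entry *e, time_t now,
                  long remember_days)
    {
        return (time_t) e->arrived < now - remember_days * 86400;
    }

Those 20 bytes per article are where the size estimates further down come
from.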
I've tried to come up with a bunch of alternate history designs, but it's
actually really hard. The constraints are:
* You want all of your history writes to be either sequential (and
bufferable) or into memory mapped pages, since otherwise you take a
huge performance hit from seeking all over the disk all the time to
make individual small writes. This is one of the reasons why the
current history database is a bit too slow.
* Most of the lookups that make it through the cache will be misses. You
have to be able to say that something is *not* in the history file
quickly, since that will be a large percentage of the queries the
backend has to process.
* You can't predict what you're going to be looking up (no perfect
hashes).
* You need to use a hash of reasonable size; what we're currently doing
really isn't large enough. Right now, we're only using 48 bits out of
MD5's 128 bits, and I think we're running a non-trivial risk of serious
hash collisions. (See the rough numbers right after this list.)
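To put a rough number on that last point, here's the standard birthday
approximation (my arithmetic, not measured data), assuming on the order
of twenty million remembered message IDs:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double n = 2e7;                              /* ~20 million entries */
        /* P(at least one collision) ~= 1 - exp(-n(n-1)/2^(bits+1)) */
        double p48  = -expm1(-n * (n - 1) / pow(2.0, 49));
        double p128 = -expm1(-n * (n - 1) / pow(2.0, 129));
        printf("48-bit hash:  %.2f\n", p48);         /* about 0.51 */
        printf("128-bit hash: %.1e\n", p128);        /* about 6e-25 */
        return 0;
    }

With 48 bits and twenty million entries a collision is closer to a coin
flip than a tail risk; the full 128 bits makes it vanish.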
If you're writing to a sequential file, such as the current history file,
then you have to store the off_t values in some sort of index file, which
makes that index file larger. You really want an index file that can live
entirely in memory to make your lookups faster.
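For comparison, the smallest plausible index entry for a sequential
text-format history looks something like this (again hypothetical, and
assuming a 64-bit off_t):

    #include <sys/types.h>

    struct history_index_entry {
        unsigned char hash[16];   /* MD5 of the Message-ID */
        off_t         offset;     /* where the full line lives in the text file */
    };                            /* 24 bytes per article with a 64-bit off_t */

That's 24 bytes per article in the index before counting the text file
itself, which is the wrong direction if the index is supposed to fit in
memory.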
The full MD5 hash is 16 bytes, which means a history entry with just the
arrival time and the hash is going to be 20 bytes. Figure on planning for
twenty million history entries at a minimum; that gives you about a 400MB
history file. If you want that file to be an actual hash table, it has to
be sparse (a hash table much more than about two-thirds full starts to
degrade badly), so it ends up being more like 600MB. Probably too large
to keep in memory. :/
>> It was mostly in the interests of making it easier to test. One could
>> make it listen to innd pretty easily and with fairly minor changes, and
>> it could be tested in isolation from a new innd to be sure of where
>> problems lie.
> this will require a lot of memory copies or disk reads that can be
> avoided in a single process. OTOH, testing is important too.
The theory was that you could do both; the same code could run either as
part of inntd or as a separate process. Running as a separate process
only requires putting a little wrapper around the same code to read in
storage tokens, mmap the article, and then hand it off to the feed threads
just like inntd would.
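Roughly, the wrapper could be this small (a sketch with invented helper
names: feed_threads_submit() stands in for whatever entry point the feed
threads end up having, and a spool path stands in for resolving a real
storage token):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* The same hand-off the in-process version would use. */
    extern void feed_threads_submit(const char *article, size_t length);

    int main(void)
    {
        char path[4096];

        /* One article per line on stdin. */
        while (fgets(path, sizeof(path), stdin) != NULL) {
            path[strcspn(path, "\n")] = '\0';
            int fd = open(path, O_RDONLY);
            if (fd < 0)
                continue;
            struct stat st;
            if (fstat(fd, &st) == 0 && st.st_size > 0) {
                char *art = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
                if (art != MAP_FAILED) {
                    feed_threads_submit(art, (size_t) st.st_size);
                    munmap(art, st.st_size);
                }
            }
            close(fd);
        }
        return 0;
    }

Being able to replay a list of articles through the feed code by hand is
most of the testing win.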
> My article structure is very different from the existing one. So I'm not
> able to use the existing code as is. This is probably a good thing
> because I'm not satisfied by the current model. Too many indirections
> and conf files, in particular.
Cleanups are definitely good things.
--
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>