INN multi-server thoughts.

Alex Kiernan alexk at demon.net
Fri Feb 16 08:39:49 UTC 2001


Dan Merillat <harik at chaos.ao.net> writes:

> There's a number of people who've gone ahead and done this for their
> sites... generally meaning code fork and making it very difficult for
> them to update to the latest code.  Some of these forks go back to 1.4.1.
> 

I'm afraid we were one of them (and the code never went back);
needless to say, what we had then is now so divergent as to be of no
help.

> My thought is that the distribution from ID hash is going to be basically
> random.  Consider a binary "flood".  Very similar message-IDs and sizes,
> but the hash is going to distribute them pretty evenly.  Keep it Simple...
> if it turns out there's a serious load balancing problem then spend extra
> time coding.

Assuming the hash distributes evenly in the bits you sample - I'm
inclined to agree with you, though (I guess it's easy to prove by
scanning the log).
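Something along these lines (a rough sketch, not INN code - the MD5
hash, bucket count, and message-ID pattern are all placeholders) would
show whether a binary flood really does spread evenly:

```python
import hashlib
from collections import Counter

def bucket(msgid: str, nservers: int) -> int:
    # Sample the last byte of an MD5 over the message-ID; any
    # well-mixed hash should behave much the same way.
    return hashlib.md5(msgid.encode()).digest()[-1] % nservers

# A binary "flood": near-identical message-IDs, as in the posting
# run of a large multipart binary.
ids = ["<part%04d.binaries@example.com>" % i for i in range(10000)]
counts = Counter(bucket(m, 4) for m in ids)
```

Even with message-IDs this similar, the per-bucket counts come out
within a few percent of each other.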

> Besides,  take the last byte of the hash and you've got 256 shares, so you
> can dynamically balance load by moving the percentages of articles from one
> machine to another. "ctlinnd load '1:128 2:64 3:64'", as an example.
> 

That's a neat idea, and great for bringing up a new server too.
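For concreteness, the 256-share scheme could look something like this
(a sketch of the idea behind that hypothetical "ctlinnd load" syntax,
not anything INN actually implements - the hash choice is a
placeholder):

```python
import hashlib
from collections import Counter

def parse_shares(spec: str):
    # "1:128 2:64 3:64" -> [(server, shares), ...]; shares sum to 256.
    pairs = [tuple(map(int, field.split(":"))) for field in spec.split()]
    assert sum(s for _, s in pairs) == 256, "shares must total 256"
    return pairs

def server_for(msgid: str, pairs) -> int:
    # Walk the share table with the last hash byte (0..255); each
    # server owns a contiguous run of byte values equal to its shares.
    byte = hashlib.md5(msgid.encode()).digest()[-1]
    for server, shares in pairs:
        if byte < shares:
            return server
        byte -= shares
    raise AssertionError("unreachable when shares total 256")

pairs = parse_shares("1:128 2:64 3:64")
counts = Counter(server_for("<%d@example.com>" % i, pairs)
                 for i in range(10000))
```

Rebalancing is then just changing the share table - articles whose
hash byte stays inside a server's run don't move at all.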

> > I was proposing to implement an NNTP storage manager which would query
> > the appropriate backend storage server. The Xref header in each
> > message would indicate which storage server an article was located on.
> 
> Won't work, since there's no way to lookup a message by ID then.  You'd
> have to scan the entire overview database.  That was my first idea as well.
> 

I was planning on either having a history file/database on each reader
or a separate history server which would handle that; the Xref header
then gets munged into something which can be used as a TOKEN.

(Actually I've just realised a history server won't work in the face
of per-reader-server caches.)

> I was going to use a variant on the timecaf code to cache articles on
> the storage server.  Modify to expire the files based on access time
> rather than modification time. (and expire if their SM token is expired
> on the main server.)
> 

My thoughts were these - when a header-only article is received, store
it as an NNTP SM style token in the history; then when the body is
retrieved, store it according to whatever storage.conf says and update
the history to reflect the new SM token (which we can do in place with
today's history).
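In sketch form (all names here are made up for illustration - the real
history file and SM token formats are quite different):

```python
# history maps message-ID -> storage-manager token.  A header-only
# article gets a token that points back at NNTP; when the body turns
# up, the entry is overwritten in place with a real SM token.
history = {}

def receive_header(msgid: str, feed_server: str):
    # Header-only: remember where the body can be fetched from later.
    history[msgid] = ("nntp", feed_server)

def receive_body(msgid: str, storage_class: str, offset: int):
    # Body retrieved: store per storage.conf (not shown) and replace
    # the history entry in place with the new SM token.
    history[msgid] = ("sm", storage_class, offset)
```

The in-place update matters because it avoids rewriting the whole
history file every time a body arrives.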

> Of course, this means keeping a separate database of access times for
> nnrpd to use.  It may be simpler to just throw a CNFS-type store on.
> Since the same 9 gig of articles get read every day, a 20 gig IDE drive
> on the reader will probably do quite well, even though the cache is
> FIFO.
> 

CNFS was what I was planning on using - if an article's in the cache
on a reader you get it, even if it's gone from the main server; if
it's not on the reader you fetch it & store it (quite how you handle
that bit I haven't figured out yet).
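The fetch-and-store half is just a read-through cache; a minimal
sketch (the dict stands in for a CNFS buffer, and "fetch_from_master"
is a stand-in for however the reader pulls the article over NNTP):

```python
# Reader-side read-through cache.
cache = {}  # message-ID -> article text

def fetch_from_master(msgid):
    # Placeholder for an NNTP ARTICLE request to the main server.
    master = {"<1@example.com>": "body one"}
    return master.get(msgid)

def get_article(msgid):
    if msgid in cache:
        # Cache hit: serve it even if the main server has since
        # expired the article.
        return cache[msgid]
    art = fetch_from_master(msgid)
    if art is not None:
        cache[msgid] = art  # store on the way through
    return art
```

The nice property is the one described above: once an article has been
read on a reader, it stays servable from there until the cache cycles,
independent of expiry on the main server.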

-- 
Alex Kiernan, Principal Engineer, Development, Thus PLC

