INNT overview

Mon Feb 12 00:00:23 UTC 2001

As promised, here is a quick-and-dirty overview of the project I'm working on.
It contains a lot of open questions and assertions. Feel free to reply if
you disagree or if you want to add your brick to the wall.

The initial idea (2 or 3 weeks ago) was to create a fast multi-threaded NNTP
server dedicated to transit, and therefore without support for readers or
non-vital transit tasks (INN already exists for that). Adding thread support
to INN is very hard, not to say impossible. I preferred to start from scratch,
taking ideas both from the current code and from the list of frustrations that
INN users and hackers have accumulated over the years. The result is something
called INNT, standing for InterNetNews Transit(er?).

First, some open issues...

As time passes, it seems that reader support will not be impossible and that
INNT could be used as a basis for the next generation of INN. My implementation
is far from complete but I want to explain as early as possible what I'm
trying to do. *IF* INNT is used in INN in the future, should we try to reuse
the code of INN at all costs? I'm not sure of the right answer.
I don't want to reinvent the wheel, but for what I've already done, it
was far easier to start from scratch. This can be a problem because
I now need some functionality that I don't want to rewrite, such as the
history and storage systems. As almost all code in INN depends on a lot of
things such as libinn and several header files, and is not designed to be
used standalone (not to mention thread safety), reusing it is not always an
easy task.
New code is also a problem for maintenance. CURRENT and STABLE are quite
similar and a bug fixed in one can often be ported to the other quite easily.
This will not be true for INNT, which raises the question: should I/we limit
the design and/or the code of INNT to fit INN?
The use of macros is also questionable. Russ did a lot of work on the macro
and compatibility systems in INN. I will probably use the compatibility code
(strerror, strcasecmp, snprintf, etc.) but I'm not fond of macros
such as NEW/DISPOSE/SIZEOF when they add nothing. I do see the value of macros
for systematic error checking and I will probably use some for that in the
near future.

Now the design...

- multi-threaded daemon using POSIX Threads (pthreads)
- the main thread is a loop waiting for incoming NNTP connections and doing
  the periodic tasks such as generating statistics. [done]
- once an incoming connection is detected, a new thread is launched. [done]
  Some checks are made: the maximum number of connections has not been
  reached (both global and per peer) and the peer is allowed (by name or
  address). [almost done but I still need an incoming conf file to describe
  this]
- only a few NNTP commands are supported, including streaming:

200 news InterNetNews Transit NNTP Server INNT version 0.01 ready
help
100 Legal commands
 help
 list
 mode stream
 ihave <msg-id>
 check <msg-id>
 takethis <msg-id>
Report problems to <usenet at sofaraway.org>

  [all of these are implemented]
- no active file supported. "list" returns a fake list with only control and
  junk. It should not be a problem to have an active file by using an
  external process (or even a dedicated thread) fed with headers and control
  messages. If reader support is wanted, overviews can also be generated that
  way.
- configurable timeout on read.
- configurable incoming streaming queue size.
- central (locked) history checks (including a built-in precommit cache)
  Should we remove unneeded fields from history? I'm currently using stock
  dbz.c and my own API while waiting for something better. I plan to use
  expire for a while too.
  Should I check message-id syntax?
- on the fly article preparation:
   - "\r\n"
   - Path updated
   - Newsgroups and Distributions headers split
   - Xref removed if present
  no other fields are checked/changed.
- articles are stored, dated and ref counted (against outgoing wishes), in
  a central queue (memory)
- a filter can be called on each article, resetting its ref count if rejected.
  It is not clear whether the filter should only prevent an article from
  being stored locally or also block its propagation to *all* other peers.
  It can be desirable to have both filtered and full outgoing peers.
- each outgoing peer is a thread. This thread monitors the central queue
  for articles and tries to offer them to the peer based upon user
  preferences (number of connections, max checks, etc.). Ref count is
  decremented when an article has been successfully sent.
- if a ref count becomes zero, the article can be removed from the central
  queue. I don't know whether, on a transit-only server, these articles
  should be stored on disk or not. Perhaps it can be configurable.
- if an article stays in the central queue for too long or if the
  queue is full, it is stored on disk and added to the backlog queue of
  each peer that still needs it. Should we store them sooner and replace
  the memory copy with an mmapped copy?
- outgoing wishes (given in an equivalent of newsfeeds) can be:
  - groups including poison and groups count
  - distributions
  - pathhosts or path size
  - full article vs headers vs path only
  - size or size range
  See my recent syntax proposal for "newsfeeds.conf" merging incoming.conf,
  newsfeeds and innfeed.conf.
- storage uses the INN API, mainly for CNFS. As the article structure is
  different, the code must be changed (and cleaned).
- inn.conf equivalent for all non-peer related parameters (innt.conf?).
  As for newsfeeds.conf, reloads must be as smart as possible.
- no plan for a ctlinntd yet.
- autoconf
- IPv6
- innpeers-stat as log analyzer.
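
To make the ref counted central queue described above a bit more concrete,
here is a minimal sketch in C. It is not the INNT code: all names (qart,
cqueue, cq_put, cq_release, cq_reject) are hypothetical, the queue is a plain
linked list behind a single pthread mutex, and spooling to disk, backlog
queues and wake-up of peer threads are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical article entry; the real INNT structure is not shown here. */
struct qart {
    const char  *msgid;    /* message-id, owned by the caller in this sketch */
    time_t       arrived;  /* arrival date, used to detect articles that stay too long */
    int          refs;     /* number of outgoing peers that still want it */
    struct qart *next;
};

struct cqueue {
    pthread_mutex_t lock;  /* one central lock, as in "central (locked)" */
    struct qart    *head;
    size_t          count;
};

void cq_init(struct cqueue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->count = 0;
}

/* Unlink an article from the list. The caller must hold q->lock. */
static void cq_unlink(struct cqueue *q, struct qart *a)
{
    struct qart **pp = &q->head;

    while (*pp != NULL && *pp != a)
        pp = &(*pp)->next;
    if (*pp == a) {
        *pp = a->next;
        a->next = NULL;
        q->count--;
    }
}

/* Queue an article wanted by `wanted` outgoing peers.
 * Returns 1 if queued, 0 if no peer wants it (nothing is queued). */
int cq_put(struct cqueue *q, struct qart *a, int wanted, time_t now)
{
    if (wanted <= 0)
        return 0;
    a->refs = wanted;
    a->arrived = now;
    pthread_mutex_lock(&q->lock);
    a->next = q->head;
    q->head = a;
    q->count++;
    pthread_mutex_unlock(&q->lock);
    return 1;
}

/* Called by a peer thread once the article has been sent successfully.
 * Returns 1 if this was the last reference and the article left the queue. */
int cq_release(struct cqueue *q, struct qart *a)
{
    int removed = 0;

    pthread_mutex_lock(&q->lock);
    assert(a->refs > 0);
    if (--a->refs == 0) {
        cq_unlink(q, a);
        removed = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return removed;
}

/* Filter rejection: reset the ref count and drop the article for all peers
 * (one of the two possible filter behaviours discussed above). */
void cq_reject(struct cqueue *q, struct qart *a)
{
    pthread_mutex_lock(&q->lock);
    a->refs = 0;
    cq_unlink(q, a);
    pthread_mutex_unlock(&q->lock);
}
```

A peer thread would call cq_release() after each successful send. Whether an
article whose ref count reaches zero is then freed or written to storage is
the open question raised above, so the sketch leaves that to the caller.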

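As an illustration of the "Path updated" step and of the "pathhosts" outgoing
wish listed above, here is a small sketch in C. The function names
(path_prepend, path_has_host) and the host names are made up for the example,
and the comparison is case-sensitive for simplicity. It only shows the idea of
prepending our own name to the Path and of skipping a peer that already
appears in it.

```c
#include <stdio.h>
#include <string.h>

/* Build "me!path" into out. Returns 0 on success, -1 if out is too small. */
int path_prepend(char *out, size_t outlen, const char *me, const char *path)
{
    int n = snprintf(out, outlen, "%s!%s", me, path);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Return 1 if host is a complete '!'-separated entry of path, else 0.
 * A peer already present in the Path does not need to be offered the article. */
int path_has_host(const char *path, const char *host)
{
    size_t hlen = strlen(host);
    const char *p = path;

    while (*p != '\0') {
        size_t len = strcspn(p, "!");   /* length of the current entry */

        if (len == hlen && strncmp(p, host, hlen) == 0)
            return 1;
        p += len;
        if (*p == '!')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

For example, prepending "news.example" to "a.example!b.example!not-for-mail"
gives "news.example!a.example!b.example!not-for-mail", and an outgoing peer
named "b.example" would then be skipped for that article.
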
Those are the essentials. I'd like to hear your comments. If someone is
interested in working on the code before I release it, contact me.

-- 
Fabien Tassin -+- fta at sofaraway.org