malformed overview! batches

Olaf Titz olaf at bigred.inka.de
Thu Jan 18 10:15:17 UTC 2001


>   however, the first line of the batch file is always corrupted.  it's
> missing about the first half of the data:

This is the backlogging channel feed problem I've been talking about
recently. The cause is roughly this: the channel feed is implemented
via a pipe or socketpair, which has a fixed-size buffer in the kernel
(usually 8192 or 32768 bytes). When a channel falls behind, the pipe
buffer fills up. The last few bytes squeezed into the buffer with
write() are most likely the first half of a line innd wants to send
to the channel. After that, write() first returns a short count, then
fails with EAGAIN (or select() no longer reports the descriptor as
writable), and innd switches to spool file mode.

Unfortunately, it now starts out with the second half of the line it
was trying to write last. Result: the first line that goes into the
spool file is corrupted. I remember hitting this problem very
frequently with overchan under INN 1.x; back then, losing an overview
line wasn't catastrophic.
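
The sequence above can be reproduced with a toy model. This is only a
sketch, not innd's code: ToyPipe and feed() are made-up names, and
ToyPipe merely imitates a non-blocking write() on a full kernel
buffer by accepting as many bytes as still fit.

```python
class ToyPipe:
    """Toy stand-in for the kernel buffer behind a non-blocking pipe."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = b""

    def write(self, chunk):
        # Like a non-blocking write(): accept only what fits, report the count.
        n = min(len(chunk), self.capacity - len(self.data))
        self.data += chunk[:n]
        return n


def feed(lines, pipe):
    """Send lines to the pipe; fall back to a spool list once it is full."""
    spool = []
    spooling = False
    for line in lines:
        if not spooling:
            sent = pipe.write(line)
            if sent == len(line):
                continue
            # Short write: the head of this line is already in the pipe,
            # so only the tail is left to go into the spool.
            spooling = True
            line = line[sent:]
        spool.append(line)
    return spool


lines = [b"alpha/1 100\n", b"beta/2 200\n", b"gamma/3 300\n"]
pipe = ToyPipe(capacity=16)
spool = feed(lines, pipe)
print(pipe.data)
print(spool)
```

With a 16-byte buffer the pipe ends in b"beta" and the spool begins
with b"/2 200\n", i.e. the second half of a line.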

The root cause is the lack of an atomic write() operation on pipes and
stream sockets under the usual Un*x API. What would be needed here is
a SEQPACKET socket, but most systems don't provide that. There is no
easy way around it on the application level; the best that can be done
is for every channel feed to read as fast as it can and do its own
buffering, so the kernel buffer never fills up. innfeed does that,
and the code is rather involved. (c-nocem does it too, in Perl even,
and that code is involved as well.)

Perhaps a library function would be in order to do this buffering for
all standard, and preferably also third-party, channel feeds. However,
this could get complicated because it doesn't fit the
while (!eof) { read_line(); process_line(); } programming model.
Basically, every channel feed would need its own select loop.
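
To give an idea of what such a helper might look like, here is a
minimal sketch in Python. It is not taken from innfeed or c-nocem;
run_channel_feed() and its parameters are hypothetical. It drains the
descriptor whenever select() reports data, keeps complete lines in
its own queue, and hands them to a callback one at a time.

```python
import os
import select
from collections import deque


def run_channel_feed(fd, process_line, chunk_size=65536):
    """Hypothetical helper: drain fd eagerly, pass whole lines to process_line.

    Returns any trailing bytes that never received a newline.
    """
    os.set_blocking(fd, False)
    pending = deque()  # complete lines waiting to be processed
    partial = b""      # bytes read so far that do not yet end in a newline
    eof = False
    while not eof or pending:
        if not eof:
            # Block only when idle; with work queued, just poll the descriptor.
            timeout = 0 if pending else None
            readable, _, _ = select.select([fd], [], [], timeout)
            if readable:
                # Empty the kernel buffer completely before doing anything else.
                while True:
                    try:
                        chunk = os.read(fd, chunk_size)
                    except BlockingIOError:
                        break  # nothing more to read right now
                    if not chunk:
                        eof = True
                        break
                    partial += chunk
                    *lines, partial = partial.split(b"\n")
                    pending.extend(lines)
        if pending:
            process_line(pending.popleft())
    return partial
```

A feed would then call run_channel_feed(0, handler) instead of
running its own read_line() loop. Note that a slow process_line()
still delays the next drain, so this only helps if the per-line work
is short or is itself made non-blocking.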

Olaf


