Thoughts on new overview
Russ Allbery
rra at stanford.edu
Fri Sep 6 02:31:46 UTC 2002
Jeffrey M Vinocur <jeff at litech.org> writes:
> I've been playing around with the overview rebuilding just now, and did
> find something worth mentioning, I guess. I took the overview built by
> INN as the articles appeared, and the results of rebuilding a particular
> group, and diffed them. The bytecounts of all of the articles have
> increased. For example (with newlines for readability):
Hm. Looking at the innd algorithm, it counts each \r\n as only a single
character for the byte count, so the byte count will change by the number
of lines in the message body plus the number of header lines in the message.
It also doesn't count the blank line between the headers and the body or
the trailing .\r\n.
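
To make the arithmetic concrete, here is a rough Python sketch of the two
counting conventions as described above. It is not the innd code: the
function names and the sample article are made up for illustration, it
assumes the article is in wire format (CRLF line endings, terminated by
.\r\n), and it ignores dot-stuffing.

```python
def wire_size(article: bytes) -> int:
    """Size of the article exactly as it appears on the wire."""
    return len(article)


def innd_style_count(article: bytes) -> int:
    """Byte count following the rules described above (illustrative only).

    Each CRLF counts as one character, and neither the blank line
    separating headers from body nor the trailing ".\r\n" is counted.
    """
    data = article
    # Drop the terminating ".\r\n" if it is present.
    if data.endswith(b"\r\n.\r\n"):
        data = data[:-3]
    # Split at the first blank line; the separator is the last header's
    # CRLF followed by the blank line's CRLF.
    headers, sep, body = data.partition(b"\r\n\r\n")
    if sep:
        headers += b"\r\n"  # restore the last header's own line ending
    # Count every remaining CRLF as a single character.
    return (len(headers.replace(b"\r\n", b"\n"))
            + len(body.replace(b"\r\n", b"\n")))


# Hypothetical two-header, two-line article in wire format.
sample = (b"Subject: test\r\n"
          b"From: a@example.com\r\n"
          b"\r\n"
          b"line one\r\n"
          b"line two\r\n"
          b".\r\n")

difference = wire_size(sample) - innd_style_count(sample)
```

For this sample the difference is one byte per header line and per body
line, plus two bytes for the blank line and three for the terminator.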
If you look at the headers of those articles too, does that account for the
differences? A quick check of those articles here seems to indicate that
it does.
I think that what tdx-util does is more correct per the proposed RFC 977
replacement than what innd is doing, although we still haven't hashed out
what the byte count is actually supposed to be. In any event, tdx-util just
stats the file and uses the file size as the byte count; I did that
because it was very simple and easy.
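
The stat-based approach amounts to something like the following Python
sketch. This is only an illustration of the idea, not the tdx-util code;
the function name and the temporary file are hypothetical.

```python
import os
import tempfile


def byte_count_from_stat(path: str) -> int:
    """Use the size of the stored file as the byte count (illustrative)."""
    return os.stat(path).st_size


# Hypothetical stored article, written to a temporary file for the demo.
stored = b"Subject: test\r\n\r\nbody\r\n"
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(stored)
    demo_path = handle.name

demo_count = byte_count_from_stat(demo_path)
os.unlink(demo_path)
```

Whatever line endings the file is stored with are counted as is, so the
result depends on the storage format rather than on any per-line rule.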
--
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>
Please send questions to the list rather than mailing me directly.
<http://www.eyrie.org/~eagle/faqs/questions.html> explains why.