jeff at litech.org
Sun Apr 10 17:19:05 UTC 2005
On Apr 8, 2005, at 12:29 PM, Clive D.W. Feather wrote:
> First draft. Comments welcome.
I like it.
> countries, newsgroup hierarchies, and individuals
> have all found different solutions which
have found a variety of solutions that
> work for them but are not
are satisfactory (?)
are adequate (?)
> With the increased use of MIME in
> email, it is becoming more common to find MIME headers identifying
> the character set of the body, but this is far from universal.
I think the mention of email makes it unclear that "body" indicates an
NNTP article. Perhaps "...it is becoming more common for NNTP articles
to include MIME headers..."?
> One point that has been generally accepted is that articles can
> contain octets with the top bit set, and NNTP is only expected to
> operate on 8-bit clean transport paths.
Potentially you need to mention NUL and bare CR/LF here?
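To make the concern concrete, here is a minimal Python sketch of the kind of check I have in mind. The function name and the reporting format are my own illustration, not anything taken from the draft, and the draft may well end up treating these octets differently. Octets with the top bit set pass through untouched; only NUL, a CR not followed by LF, and an LF not preceded by CR are flagged.

```python
def find_problem_octets(article: bytes) -> list[tuple[int, str]]:
    """Return (offset, description) for each NUL, bare CR, or bare LF.

    Hypothetical helper for illustration only. Octets >= 0x80 are
    deliberately not reported, since the transport is assumed 8-bit clean.
    """
    problems = []
    n = len(article)
    for i, b in enumerate(article):
        if b == 0x00:
            problems.append((i, "NUL"))
        elif b == 0x0D and (i + 1 >= n or article[i + 1] != 0x0A):
            # CR that is not immediately followed by LF
            problems.append((i, "bare CR"))
        elif b == 0x0A and (i == 0 or article[i - 1] != 0x0D):
            # LF that is not immediately preceded by CR
            problems.append((i, "bare LF"))
    return problems
```

An article using only CRLF line endings yields an empty list, even if it contains 8-bit octets.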
> and not gratuitously break existing implementations and
> arrangements, even if they are less than optimal.
This feels a little wordy. Something like "and not needlessly break
existing functional but suboptimal implementations and arrangements"?
> The NNTP itself is extended from US-ASCII [ANSI1986] to UTF-8
> [RFC3629] in this specification. Except in the specific areas
> discussed below, UTF-8 (which is a superset of ASCII) is mandatory
> and implementations MUST NOT use any other encoding.
> The major deviation from this requirement
> some header values (and, of course, the article body) are generated
> by users using software which adopts local practices; for example,
> may encode all text is in ISO 8859-1 without including a MIME header
> to that effect.
I had trouble determining the referent of "it" in this sentence;
perhaps substituting "a client" would clear it up?
(Incidentally, here I see another "which" in a non-restrictive clause.
Perhaps you don't believe in that grammar rule and I should stop
pointing it out?)
> More specifically, while implementations
> SHOULD only allow the creation of new articles where the headers
> conform to UTF-8, where an article is obtained from an external
> source an implementation MAY pass it on, and derive data from it
> (such as the response to the HDR command), even though the article
> or the data is not valid UTF-8.
This should be broken into two sentences for clarity. Suggest:

    More specifically, implementations SHOULD only allow the creation
    of new articles where the headers conform to UTF-8. However, when
    an article is obtained from an external source, an implementation
    MAY pass it on, and derive data from it (such as the response to
    the HDR command), even though the article or the derived data may
    not be valid UTF-8.
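As a rough illustration of the split between the two sentences, here is a minimal Python sketch. The function names are hypothetical, and treating "pass it on" as "forward the octets unchanged" is my own assumption rather than something the draft states. The point is only that the UTF-8 check applies when an article is created, not when it is relayed.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def may_create(headers: bytes) -> bool:
    # SHOULD: only allow new articles whose headers conform to UTF-8.
    return is_valid_utf8(headers)


def relay(article: bytes) -> bytes:
    # MAY: pass on an externally obtained article even if it is not
    # valid UTF-8. Assumption: the octets are forwarded unmodified.
    return article
```

For example, a Subject header encoded in ISO 8859-1 with an accented character would be refused by may_create() but would still go through relay() unchanged.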
> Implementations MUST transfer such articles and data correctly.
What does "correctly" mean here?
> The second area of deviation is
I guess if you like "exception" for "deviation" above, it should be
changed here too.
> Restricting newsgroup names to UTF-8 is not a complete solution to
> the issues, of course. In particular, when new newsgroup names are
> created or a user is asked to enter a newsgroup name, some form of
> canonicalisation will need to take place.
Probably a little more text about canonicalization would be useful here.
Jeffrey M. Vinocur
jeff at litech.org