MIME Distribution: headers
Russ Allbery
rra at stanford.edu
Thu Jan 1 00:39:07 UTC 2009
Julien ÉLIE <julien at trigofacile.com> writes:
> The Distribution: header is not necessarily in US-ASCII characters.
>
> Though USEFOR states:
>
> dist-name = ALPHA / DIGIT
> *( ALPHA / DIGIT / "+" / "-" / "_" )
>
> RFC 3977 states:
>
> distribution = token
> token = 1*P-CHAR
> P-CHAR = A-CHAR / UTF8-non-ascii
>
> As we already did in Makefile.global:
>
> ## If you modify these two strings, you must encode them in UTF-8
> ## (using only US-ASCII characters is consequently also fine) and
> ## keep their length reasonable; otherwise, your news server will not
> ## be complying with the NNTP protocol.
>
> VERSION = 2.5.0
> VERSION_EXTRA = prerelease
>
> we probably should do the same with the distrib.pats file and its man page.
Yup, sounds good.
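
To make the difference between the two grammars quoted above concrete, here is a rough sketch in Python (chosen only for brevity; INN itself is C and this is not INN code). The function names are made up for illustration, and the A-CHAR range of %x21-7E is taken from my reading of RFC 3977 rather than from the text quoted above, so treat it as an assumption.

```python
import re

# USEFOR dist-name: ALPHA / DIGIT, then any of ALPHA / DIGIT / "+" / "-" / "_".
_USEFOR_DIST_NAME = re.compile(rb"[A-Za-z0-9][A-Za-z0-9+\-_]*\Z")


def is_usefor_dist_name(raw: bytes) -> bool:
    """Hypothetical helper: True if raw matches the USEFOR dist-name rule."""
    return _USEFOR_DIST_NAME.match(raw) is not None


def is_rfc3977_token(raw: bytes) -> bool:
    """Hypothetical helper: True if raw matches token = 1*P-CHAR.

    P-CHAR = A-CHAR / UTF8-non-ascii.  A-CHAR is assumed to be %x21-7E
    (printable US-ASCII without space); UTF8-non-ascii is any well-formed
    UTF-8 sequence for a code point above U+007F.
    """
    if not raw:
        return False
    try:
        text = raw.decode("utf-8")  # strict: rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return all(0x21 <= ord(ch) <= 0x7E or ord(ch) > 0x7F for ch in text)
```

Under these assumptions, a distribution such as "français" encoded in UTF-8 is a valid RFC 3977 token but not a valid USEFOR dist-name, which is the gap being discussed.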
> However, my concern is with newsfeeds. It handles distributions. Does
> it mean we have to do some sort of MIME decoding in order to implement
> RFC 3977?
No, I don't think so, at least unless we decide to do so with all
configuration files. The easy approach, and the reasonable one for now, I
think, is to require that people who put non-ASCII distributions in
newsfeeds use UTF-8. (This is probably also worth a comment.) Then the
existing code, which does byte string comparisons, should just work.
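
As a small illustration of why byte string comparison is enough once everything is UTF-8, and why a newsfeeds entry saved in another encoding would silently fail to match, here is a sketch in Python. It is not the INN code (which is C); wants_distribution and its arguments are hypothetical names used only for this example.

```python
from typing import List


def wants_distribution(header_value: bytes, feed_distributions: List[bytes]) -> bool:
    """Hypothetical helper: plain byte-for-byte comparison, no decoding.

    header_value stands for one distribution taken from an article's
    Distribution: header; feed_distributions stands for the distributions
    listed for one feed, as raw bytes read from the configuration file.
    """
    return any(header_value == wanted for wanted in feed_distributions)


# The same name as it would appear on the wire (UTF-8) and as it would be
# stored by an editor using UTF-8 or ISO 8859-1.
on_the_wire = "québec".encode("utf-8")
config_saved_as_utf8 = ["québec".encode("utf-8")]
config_saved_as_latin1 = ["québec".encode("latin-1")]

matches_utf8_config = wants_distribution(on_the_wire, config_saved_as_utf8)
matches_latin1_config = wants_distribution(on_the_wire, config_saved_as_latin1)
```

Here matches_utf8_config is true and matches_latin1_config is false, so requiring UTF-8 in the file is what makes the undecoded comparison correct.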
> Another question: for the active.times file, I do not know what the
> best way is to make sure the newsgroup creator's name is written in
> UTF-8... I think that only ctlinnd matters for that (mod-active and
> controlchan write "usenet" or something like that -- I have not
> checked). Is putting a warning in the man page of ctlinnd enough? The
> encoding depends on that of the shell used!
Yeah, that one is hard. I'm not sure there's any really good solution
there other than a warning... I guess the other option would be to check
the string we're about to write to be sure it's correctly formed UTF-8,
and if it isn't, fail with an error instead of creating the group.
We probably need a general function to check for correctly formed UTF-8
anyway.
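
For what such a general check might look like, here is a minimal sketch in Python that walks the bytes by hand, roughly the way a C function would. It follows the well-formed byte ranges of RFC 3629 (no overlong forms, no surrogates, nothing above U+10FFFF). Both function names are hypothetical and nothing here is existing INN code.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data is well-formed UTF-8 (RFC 3629)."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead <= 0x7F:                      # single-byte US-ASCII
            i += 1
            continue
        # For each lead byte: number of continuation bytes and the range
        # allowed for the first one (the rest must be 0x80-0xBF).
        if 0xC2 <= lead <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif lead == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF      # excludes overlong forms
        elif 0xE1 <= lead <= 0xEC or 0xEE <= lead <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif lead == 0xED:
            need, lo, hi = 2, 0x80, 0x9F      # excludes surrogates
        elif lead == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF      # excludes overlong forms
        elif 0xF1 <= lead <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif lead == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F      # excludes > U+10FFFF
        else:
            return False                      # 0x80-0xC1 and 0xF5-0xFF
        if i + need >= n:
            return False                      # truncated sequence
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True


def require_utf8(creator: bytes) -> bytes:
    """Hypothetical helper: fail with an error instead of writing bad data."""
    if not is_well_formed_utf8(creator):
        raise ValueError("creator name is not well-formed UTF-8")
    return creator
```

With something like this, a caller could refuse to create the group when the name coming from the shell is, say, ISO 8859-1, rather than writing it to active.times as is.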
> Dealing with encodings is not easy at all!
Indeed. It's extremely tricky.
--
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>
Please send questions to the list rather than mailing me directly.
<http://www.eyrie.org/~eagle/faqs/questions.html> explains why.