Checkgroups and description update (problem)

Julien ÉLIE julien at trigofacile.com
Wed Nov 22 22:59:57 UTC 2006


In reply to Julien ÉLIE:
> However, it works WITH AN ASCII newsgroups file.

Yes, so it can be used as long as people have not changed the default encoding of
their newsgroups file.

The only problem is for people who use another encoding, for instance UTF-8.


> I for one use a UTF-8 newsgroups file, so hierarchies like « fr.* » or « cn.bbs.* »
> are not well handled by the modification I suggest.
> 
> Does somebody know how to deal with that issue?
> We could do an « iconv » conversion but I doubt whether every system has it installed :-/

For instance (but it is very ugly):

${EGREP} "${PATS}" ${NEWSGROUPS} | ${EGREP} "${1:-.}" | ${SED} 's/[	]\+/	/' |
     sort >${T}/$$localdesc

if (echo ${PATS} | ${EGREP} "\^cn\[" > /dev/null) ; then
     ${EGREP} "${PATS}" ${T}/$$msg | ${EGREP} "${1:-.}" | ${SED} 's/[	]\+/	/' |
         iconv -f gb2312 -t utf-8 | sort >${T}/$$newdesc
else
     ${EGREP} "${PATS}" ${T}/$$msg | ${EGREP} "${1:-.}" | ${SED} 's/[	]\+/	/' |
         iconv -f iso-8859-1 -t utf-8 | sort >${T}/$$newdesc
fi



Perhaps the target encoding (« utf-8 » above) could be a global parameter in « inn.conf »
giving the encoding of the newsgroups file.
Beyond that, I do not know whether checkgroups will keep using « iso-8859-1 » in the
future.  I also have to check whether the hierarchy is Chinese, since those hierarchies
send checkgroups articles encoded in gb2312.
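
To make the idea concrete, here is a rough sketch only, not a tested patch.  The
function names « source_charset » and « to_local_charset » and the variable
« NEWSGROUPS_CHARSET » are hypothetical (nothing of the sort exists in « inn.conf »
today), and the hierarchy-to-charset mapping is simply the assumption made above.
It falls back to passing the data through untouched when iconv is not installed.

```shell
#!/bin/sh
# Hypothetical helper: guess the charset of a checkgroups message from the
# hierarchy name.  The mapping (gb2312 for cn.*, iso-8859-1 otherwise) is an
# assumption for illustration, not something INN defines.
source_charset() {
    case "$1" in
        cn.*) echo gb2312 ;;
        *)    echo iso-8859-1 ;;
    esac
}

# Hypothetical helper: convert stdin from charset $1 to charset $2.
# The data is passed through unchanged when both charsets are equal or
# when iconv is not available on the system.
to_local_charset() {
    if [ "$1" != "$2" ] && command -v iconv >/dev/null 2>&1; then
        iconv -f "$1" -t "$2"
    else
        cat
    fi
}
```

In the pipeline above, the two iconv branches would then collapse into a single
« to_local_charset "$(source_charset cn.bbs)" "${NEWSGROUPS_CHARSET:-utf-8}" » stage
placed before « sort ».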

Any ideas?
How can the encoding be handled easily?

But do we have to worry about that now, given that it works with an ASCII newsgroups file?

-- 
Julien

« -- Mais je leur ai simplement demandé si c'était la bonne route !!!
   -- Ils nous ont fait une 'éponse de No'mand ! » (Astérix)


