Russ Allbery rra at stanford.edu
Sat Aug 22 20:00:02 UTC 2009

Julien ÉLIE <julien at trigofacile.com> writes:

> Does someone know a good UTF-8 checker to validate a string?

The utf8_length function in lib/uwildmat.c is close and could probably be
turned into a validator without a whole lot of difficulty.  It currently
accepts invalid UTF-8 sequences and just returns a length of 1 for them,
but it wouldn't take much more checking to reject them instead.
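
As a rough illustration of the extra checking involved, here is a minimal
sketch of a stand-alone validator.  It is not the utf8_length code from
lib/uwildmat.c; the function name is_valid_utf8 and its interface are
assumptions made up for this example, and the rules it applies are the
usual ones from RFC 3629 (no overlong forms, no surrogates, nothing above
U+10FFFF).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of INN.  Returns true if the len bytes
 * starting at s are well-formed UTF-8 under the RFC 3629 rules.
 */
bool
is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;           /* continuation bytes expected */
        unsigned long min;      /* smallest code point for this length */
        unsigned long cp;       /* decoded code point */

        if (c < 0x80) {
            /* Plain ASCII, including control characters. */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; min = 0x80; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; min = 0x800; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; min = 0x10000; cp = c & 0x07;
        } else {
            /* Stray continuation byte or invalid lead byte. */
            return false;
        }

        /* Reject a sequence truncated by the end of the buffer. */
        if (len - i <= extra)
            return false;

        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        if (cp < min)
            return false;       /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;       /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return false;       /* beyond the Unicode range */

        i += extra + 1;
    }
    return true;
}
```

Note that a byte like \x08 is still valid UTF-8 as far as this check is
concerned, so rejecting control characters would have to be a separate
policy test on top of it.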

> For instance, control characters like \x08 are accepted by this
> implementation <http://snowplow.org/martin/utf8checker/> (though
> released under Apache License v2, but as it is said in the README file,
> we could ask for GPL v2 to match what we have most in INN).

We can't incorporate GPL v2 code in the main parts of INN since it would
then conflict with the OpenSSL license, and distributions couldn't
distribute binaries built with OpenSSL.  GPL v2 is fine for stand-alone
binaries and scripts, but not for the library.

