Rob Butler crodster2k at
Tue Dec 5 05:13:46 UTC 2006



I've been looking into IDNA lately, and unfortunately I think most IDNA libraries may have a bug.  I'm looking for feedback on whether I am wrong or whether the libraries do in fact have a bug.  The main reason I doubt myself is that every library I looked at seems to behave the same way.

IDNA encoding is performed on a label-by-label basis.  First the domain name must be broken down into labels, and then each label is encoded in turn.  Most IDNA libraries don't seem to account for escaped dots embedded within a label.

For example, assuming "test\" were actually a name that needed IDNA encoding to be valid ASCII, it should be broken down into the labels "test\.me", "example", "com" and the root label, right?  But from my review of at least three different IDNA libraries, it appears they would instead split it into "test\", "me", "example", "com" and the root label.  Isn't that a bug?  Wouldn't the resulting IDNA ASCII name be invalid?
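
To make the difference concrete, here is a minimal Python sketch of the two ways of splitting.  It is not taken from any of the libraries in question.  The input name is a hypothetical one chosen to produce the labels listed above, the function names are made up for illustration, and the escaping rule assumed is the master-file convention from RFC 1035, where a backslash quotes the character that follows it.

```python
def split_naive(name):
    # Splits on every dot and ignores backslash escapes,
    # which is the behavior described above as the suspected bug.
    return name.split(".")


def split_escaped(name):
    # Splits only on dots that are not quoted by a preceding backslash.
    # Assumption: a backslash quotes the single character after it.
    labels = []
    current = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "\\" and i + 1 < len(name):
            # Keep the escape and the quoted character together in the label.
            current.append(name[i:i + 2])
            i += 2
            continue
        if ch == ".":
            labels.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    labels.append("".join(current))
    return labels


# Hypothetical input; the trailing dot yields the empty root label.
name = r"test\.me.example.com."
naive_labels = split_naive(name)
escaped_labels = split_escaped(name)
print(naive_labels)
print(escaped_labels)
```

In both results the final empty string stands for the root label.  Only the second split keeps the escaped dot inside the first label.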

If it is a bug, should it be fixed?  Or, since everyone made the same mistake, is it better left alone so that we all at least stay compatible?  And what does this mean for domains that have dots within a label?

Thanks - your feedback is greatly appreciated.

