

Improve handling of invalid URI characters


Currently, characters with a character code >= 128 are not recognized as illegal characters and are not hex-encoded. However, legal URI characters must come from the US-ASCII set, which covers only character codes 0-127. The proper way to encode any other character is to convert it to UTF-8 first and then hex-encode the resulting bytes one by one.
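
To illustrate the idea, here is a minimal sketch of that byte-by-byte encoding. It is not the attached patch: the function name PercentEncodeUtf8 is hypothetical, the input is assumed to already be UTF-8 (the conversion from the application's native string type is left out), and the set of characters left untouched (the RFC 3986 unreserved characters plus '/' and ':') is an assumption made for this example.

```cpp
#include <string>

// Hypothetical helper for illustration only; not the code from the patch.
// Assumes `utf8` already holds UTF-8 encoded bytes.
std::string PercentEncodeUtf8(const std::string& utf8)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size());

    // Iterate as unsigned char so bytes >= 128 are not treated as negative.
    for (unsigned char byte : utf8) {
        const bool isUnreserved =
            (byte >= 'A' && byte <= 'Z') ||
            (byte >= 'a' && byte <= 'z') ||
            (byte >= '0' && byte <= '9') ||
            byte == '-' || byte == '.' || byte == '_' || byte == '~';
        // Assumption: keep path and scheme delimiters readable.
        const bool isKeptDelimiter = (byte == '/' || byte == ':');

        if (isUnreserved || isKeptDelimiter) {
            out += static_cast<char>(byte);
        } else {
            // Everything else, including every byte >= 128, becomes %XX.
            out += '%';
            out += kHex[byte >> 4];
            out += kHex[byte & 0x0F];
        }
    }
    return out;
}
```

With this sketch, the letter ä (UTF-8 bytes 0xC3 0xA4) would come out as %C3%A4, and a space as %20.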

The encoding error becomes apparent when converting files with German Umlauts (äöüÄÖÜ).

I am proposing the attached patch for the StringUtils.cpp file.


Closed Dec 23, 2016 at 8:17 PM by clechasseur
Sorry for the long delay, but it seems you retracted it before I could take a look at it. I will thus close this.


Tauris wrote Nov 23, 2016 at 3:53 PM

I would like to retract this patch proposal. I have since learned that it would create problems with Internet Explorer as described here:
Internet Explorer would not convert back to UTF-8.