Henri Sivonen
That seems relevant in terms of identifying an _encoder_ that can produce byte sequences below the HKSCS area.
Some findings about `WideCharToMultiByte` with flags set to zero (i.e. "best fit" _not_ forbidden): 950/Big5: The U+FFFD cells that can be seen in the [Big5 visualization](https://encoding.spec.whatwg.org/big5.html) are filled with PUA...
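For readers mapping byte pairs onto cells of that visualization, here is a minimal sketch of the pointer arithmetic used by the Encoding Standard's Big5 decoder. The function name `big5_pointer` is made up for illustration; only the arithmetic (157 trail slots per lead, offsets 0x40 and 0x62) comes from the spec.

```python
from typing import Optional


def big5_pointer(lead: int, trail: int) -> Optional[int]:
    """Return the index pointer for a Big5 lead/trail byte pair.

    Hypothetical helper: returns None when either byte is outside the
    ranges the Encoding Standard's Big5 decoder accepts.
    """
    if not 0x81 <= lead <= 0xFE:
        return None
    if 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE:
        # Trails 0x40..0x7E occupy slots 0..62, trails 0xA1..0xFE slots 63..156.
        offset = 0x40 if trail < 0x7F else 0x62
        return (lead - 0x81) * 157 + (trail - offset)
    return None


def trail_is_ascii(trail: int) -> bool:
    """True when the trail byte would also be a valid ASCII byte on its own."""
    return trail <= 0x7F
```

Note that every lead byte has 63 trail slots (0x40 to 0x7E) that overlap ASCII, which is why unmapped cells with such trails matter for the rest of this discussion.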
Random thought: eudcedit.exe offers to start from the start of the Unicode Private Use Area. In the case of Shift_JIS and gbk, the Encoding Standard is well compatible with this. (Let's...
(This is not yet a suggestion to make our EUC-KR PUA-consistent with 949. At present I'm just trying to understand why things are the way they are.)
It [seems](https://docs.microsoft.com/en-us/windows/desktop/intl/eudccoderange) that not all PUA mappings in Windows legacy code pages are strictly considered part of EUDC.
moztw.org maintains a [repository of information about Big5](https://translate.google.com/translate?sl=zh-CN&tl=en&u=http%3A%2F%2Fmoztw.org%2Fdocs%2Fbig5%2F). Checking the UAO tables there, it's worth noting that [UAO had mappings](http://moztw.org/docs/big5/table/uao250-b2u.txt) for the byte pairs whose lead is in the range...
> I'm against any more tweaking in general

The more I've examined the issue reported here, the more convinced I am that

* The browser-side security mitigations for legacy Java...
> Since JIS X 0208 has undergone less extension and the extensions have happened ages ago, I think we _probably_ don't need to change Shift_JIS analogously for trails that are...
I think we should do this at least for Big5, because it would better protect Java-based and Windows (kernel32.dll)-based generators against getting XSSed. @achristensen07, do you have an opinion?
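To make the XSS concern concrete, here is a rough sketch of how a decoder that follows the Encoding Standard's Big5 algorithm handles an unmapped pair whose trail byte is in the ASCII range: it emits U+FFFD for the lead and then decodes the trail as ASCII. The function name `decode_big5_sketch` and the caller-supplied `index` dict are assumptions for illustration; the real index and the four special-cased pointers that decode to two code points are omitted.

```python
def decode_big5_sketch(data: bytes, index: dict) -> str:
    """Simplified Big5 decoding loop (illustrative, not the full spec algorithm).

    `index` is a hypothetical pointer -> character mapping supplied by the caller.
    """
    out = []
    lead = 0
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if lead:
            pointer = None
            if 0x40 <= b <= 0x7E or 0xA1 <= b <= 0xFE:
                offset = 0x40 if b < 0x7F else 0x62
                pointer = (lead - 0x81) * 157 + (b - offset)
            lead = 0
            ch = index.get(pointer) if pointer is not None else None
            if ch is not None:
                out.append(ch)
                continue
            # Unmapped pair: report an error for the lead byte.
            out.append("\ufffd")
            if b <= 0x7F:
                # An ASCII trail is not consumed; it is decoded again on its own.
                i -= 1
            continue
        if b <= 0x7F:
            out.append(chr(b))
        elif 0x81 <= b <= 0xFE:
            lead = b
        else:
            out.append("\ufffd")
    if lead:
        # Input ended in the middle of a pair.
        out.append("\ufffd")
    return "".join(out)
```

With an empty index, `decode_big5_sketch(b"\x81\x5c", {})` yields U+FFFD followed by a backslash, so if a generator emits such a pair for a single (for example PUA) character, the receiving side sees a stray ASCII character that the generator never wrote.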
Note that UTC disagreed with the Encoding Standard on this point: https://www.unicode.org/L2/L2019/19192-review-docs.pdf That is, making the requested change here would align the Encoding Standard with the UTC position.