jansson icon indicating copy to clipboard operation
jansson copied to clipboard

Unicode Path's Issue

Open elTRexx opened this issue 4 years ago • 7 comments

Hi, First sorry if I misbehave here by reporting an issue originally from another software using your library in back end.

So I got an issue with HandBrake software to load my preferences stored in a json file. Posted an issue over there, and the dev report that the issue is due to the fact that I have an UNICODE username, and so my user Windows directory path isn't handled by the json library they use, namely jansson lib. : https://github.com/HandBrake/HandBrake/issues/2427 They post a workaround, but but no fix from them, it is on you to fix your lib they said.

Thanks.

elTRexx avatar Apr 21 '20 19:04 elTRexx

json_load_file() takes const char *path and passes it directly to fopen():

https://github.com/akheron/jansson/blob/v2.12/src/load.c#L1074

On a linux or mac system, these file path C strings would be encoded in utf8. Windows has always been different in these cases ... it has other file APIs that take utf16-le, but that's not really compatible with C strings. On windows fopen() considers the filename to be in the ANSI codepage by default (or the OEM codepage if the windows application configured this globally). But there is a wacky mode string format to enable utf8: "rb,ccs=UTF-8":

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?view=vs-2019#unicode-support

So jansson could do that when compiled on Windows ... but is HandBrake encoding the path string as utf-8 on windows?

ploxiln avatar Apr 21 '20 22:04 ploxiln

correction: the mode "ccs=UTF-8" is for transforming the file contents, it does not apply to the path argument, nevermind that idea

ploxiln avatar Apr 22 '20 04:04 ploxiln

Hi, So I look on this, and what I found is, in theory, usign _wfopen() instead of fopen(). https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?redirectedfrom=MSDN&view=vs-2019

I could test it (never compiled myself a GitHub project before but I can try). But upfront is usign this official alternative function instead of legacy fopen() sound ok to you or ring some alarm to you ?

Thx

Mr Maignan A.

Note : I’m french so my English isn’t perfect.

De : Pierce Lopezmailto:[email protected] Envoyé le :mercredi 22 avril 2020 06:02 À : akheron/janssonmailto:[email protected] Cc : elTRexxmailto:[email protected]; Authormailto:[email protected] Objet :Re: [akheron/jansson] Unicode Path's Issue (#531)

correction: the mode "ccs=UTF-8" is for transforming the file contents, it does not apply to the path argument, nevermind that idea

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/akheron/jansson/issues/531#issuecomment-617535581, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AIU7PAECRLCWNFC3ZUEMJCTRNZT4XANCNFSM4MNRSUWA.

elTRexx avatar Apr 22 '20 10:04 elTRexx

Handbrake could open the file with unicode name themselves using _wfopen and call json_loadf on the FILE* handle. I'd rather not add Windows specific file opening stuff to Jansson.

akheron avatar Apr 22 '20 12:04 akheron

This seems like a reasonable documentation improvement to specify a known limitation for Windows. We could advise json_loadf with an already open FILE* as the method with best compatibility with non-ASCII filenames.

coreyfarrell avatar Apr 22 '20 13:04 coreyfarrell

Looks like they already fixed it that way: https://github.com/HandBrake/HandBrake/commit/cd252068e94ea1edff177e5abe93c38c5eb6a037

(hb_fopen() uses _wfopen_s() on windows, and converts path to utf16 for it)

ploxiln avatar Apr 22 '20 15:04 ploxiln

Thanks !

I was battling against cygwin and stuff to try compile handbrake myself once I understood HandBrake get their 3rd party Library, like yours, on compile time ! (fetching them online from repo, compile them all in a single dll, the « core » of handbrake if I am right (well after very few hours looking on sources)). So yeah forking my own jansson lib was useless here as they get their own jansson lib online.

And compiling HandBrake on Windows need … Linux ! That I don’t have, so was currently trying/texting to do the compile using cygwin, and then your mail !

So yeah, you pointed out exactly where they already used _wfopen() (actually _wfopen_s(), I supose a variant of previous one) and they could reuse this code to handle UNICODE Path.

So I see you contacted HandBrake dev, and help them out with my concern ! I hope taking the liberty to contact you directly will not be perceived bad from them, or your team. If so, then I Apologize.

So for now, continuing trying to compile and build HandBrake with this new modification you suggested to them ! (and I still don’t understand why a crossplatform software like handbrake, need a linux OS to perform compilation …)

Mr Maignan A.

De : Pierce Lopezmailto:[email protected] Envoyé le :mercredi 22 avril 2020 17:52 À : akheron/janssonmailto:[email protected] Cc : elTRexxmailto:[email protected]; Authormailto:[email protected] Objet :Re: [akheron/jansson] Unicode Path's Issue (#531)

Looks like they already fixed it that way: HandBrake/HandBrake@cd25206https://github.com/HandBrake/HandBrake/commit/cd252068e94ea1edff177e5abe93c38c5eb6a037

(hb_fopen() uses _wfopen_s() on windows, and converts path to utf16 for it)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/akheron/jansson/issues/531#issuecomment-617864930, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AIU7PAET3UQ4RMXQHKJ4TVLRN4HFJANCNFSM4MNRSUWA.

elTRexx avatar Apr 22 '20 17:04 elTRexx