lua-cjson icon indicating copy to clipboard operation
lua-cjson copied to clipboard

Writing floats is locale dependent

Open smurfix opened this issue 3 years ago • 6 comments

Hi, CJSON is not locale independent. Specifically, if the current locale wants us to encode the decimal point as a comma, as in Germany, CJSON will happily do so.

This makes the other side rather unhappy. Worse, if the problem occurs in a list it's not even recognized by the receiving parser.

Please fix!

smurfix avatar Aug 31 '20 14:08 smurfix

Afaik, this bug was fixed in 1.0.4.

Can you please provide some more details about the platform you are using, and how you are compiling Lua CJSON?

There are 2 ways of compiling floating point support:

  • Adapting the system strtod/snprintf routines to use "." with any locale (the default)
  • Using internal dtoa/g_fmt routines to avoid any locale issues and increase performance. (USE_INTERNAL_FPCONV).

This is documented in the manual under Built-in floating point conversion.

mpx avatar Sep 01 '20 09:09 mpx

I'm using the Debian package for 2.1 which doesn't use specific preprocessor arguments. Apparently the program I'm linking with manages to change the current locale after your fpconv_init gets called.

I'll ask the Debian maintainer to build with USE_INTERNAL_FPCONV, but cjson still needs to support usage with a non-constant locale setting IMHO.

smurfix avatar Sep 02 '20 10:09 smurfix

Also, what about locales that don't even use "our" digits? Please consider unconditionally switching to USE_INTERNAL_FPCONV and/or using that as the default.

smurfix avatar Sep 02 '20 10:09 smurfix

While the hack to detect/adapt the decimal separator for different locales is ugly, I'm not aware of any specific locales where it doesn't work (happy to hear otherwise).

Changing the locale after the program has started is typically a bad strategy. setlocale isn't thread-safe, and various code might make assumptions based on the previous configuration.

I agree it would be better to have a single implementation. However, there are tradeoffs:

  • Platform routines
    • Pro: Work everywhere
    • Con: Slower, breaks if locale changes to a different decimal separator mid-execution (ideally that should never happen).
  • Internal dtoa routines
    • Pro: Faster
    • Con: Needs to be correctly built for little/big endian, and with/without Pthreads.

It's an awkward tradeoff. I wanted Lua CJSON to just work out of the box, which means using the platform routines. The internal dtoa has advantages but must be build correctly for the platform and the codebase.

If Debian wants to use the internal routines it will need to correctly set the endianness, and probably always enable/link pthreads (since it may be used in a multi-threaded application).

mpx avatar Sep 02 '20 13:09 mpx

Another option: Can you call fpconv_init after the locale has been changed? fpconv_init isn't thread safe, but neither is setlocale, so it shouldn't be any worse.

mpx avatar Sep 03 '20 08:09 mpx

Another option..

You could use compiler defines to autodetect the correct endianness for USE_INTERNAL_FPCONV by adapting something like this: https://stackoverflow.com/a/58642785

It might be reasonable to add something similar to CJSON as well, but it would need testing across a wide range of compilers/platforms first.

mpx avatar Jan 23 '21 08:01 mpx