json11 icon indicating copy to clipboard operation
json11 copied to clipboard

JSON output is locale-dependent and is not cross-platform.

Open sergey-shambir opened this issue 9 years ago • 16 comments

json11 uses locale-dependent functions for conversion between double and it's string representation. On Ubuntu with Russian language it prints doubles with comma like '3,1415926'. On Windows with Russian language pack or any OS with English language it prints '3.1415926' and parser also expects '3.1415926' so '3,1415926' cannot be parsed.

sergey-shambir avatar Jul 29 '15 05:07 sergey-shambir

Sergey

Thanks for pointing that out. I'm just a user myself, not any of the developers, and I hadn't thought about this issue so far. This is actually a problem for me too. I wonder if the json specification mandates how floating point numbers are to presented and whether json is cross platform by definition.

Peter

On Wed, Jul 29, 2015 at 6:26 AM, sergey-shambir [email protected] wrote:

json11 uses locale-dependent functions for conversion between double and it's string representation. On Ubuntu with Russian language it prints doubles with comma like '3,1415926'. On Windows with Russian language pack or any OS with English language it prints '3.1415926' and parser also expects '3.1415926' so '3,1415926' cannot be parsed.

— Reply to this email directly or view it on GitHub https://github.com/dropbox/json11/issues/38.

peterritter avatar Jul 29 '15 07:07 peterritter

It does: http://www.json.org/

It is very clear that a period is supposed to be used for the decimal point if you look at the parsing diagrams.

crazy2be avatar Oct 15 '15 01:10 crazy2be

This is definitely a bug. We likely don't have time to implement a proper fix, but I'd love to see a pull request. Options include:

  • Use snprintf_l if available or emulate somehow if not
  • Loop through the string after conversion and replace localeconv()->decimal_point with .
  • Embed our own copy of grisu2 (I think this is my preferred option)

j4cbo avatar Oct 15 '15 22:10 j4cbo

json11 also have str->double bug, yes, in parse_number() it checks that number have "." as delimiter, but after that it uses std::strtod, and this function depend on locale, so 10.5 it can convert to 10.0, because of it expects another delimiter.

Dushistov avatar Apr 27 '16 00:04 Dushistov

What about import code from rapid json, it has very fast string<->double routies: https://github.com/miloyip/rapidjson/tree/master/include/rapidjson/internal

and the code is under MIT license?

Dushistov avatar Apr 27 '16 00:04 Dushistov

Also there is https://github.com/google/double-conversion from author of girus3 (BSD like license), what about use it as submodule?

Dushistov avatar Apr 27 '16 00:04 Dushistov

I convert json11 to usage of double-conversion (for string->double and double->string): https://github.com/Dushistov/json11/commit/115fc7609ddd8bb2661ea0dd5ce54a5b347c0c62

nativejson-benchmark from here https://github.com/miloyip/nativejson-benchmark give:

Before: Parse Time (ms) 51
Stringify Time (ms) 109

After: Parse Time (ms) 41 Stringify Time (ms) 23

As you can see speedup 24% for parsing, and 374% for dump, plus it not depend on locale.

What do you think?

Dushistov avatar Apr 27 '16 02:04 Dushistov

I like the idea of fixing this, and like the sound of those benchmarks. A submodule seems quite heavy weight for json11, which is currently a very tiny library which is easy to integrate. Dropbox's own build environment wouldn't have any trouble with a submodule, but it's a lot of overhead for other users. Is there a simpler option?

I'll get an answer on whether the license is okay, but I suspect it would be.

artwyman avatar May 18 '16 22:05 artwyman

Bump. In my opinion this is a pretty heavy bug. I'm happy I found this discussion after googleing the right words. We used json11 in a project for a while and I just started to use it for the first double-values. After parsing there values where all truncated ints (which resulted in completely deformed 3D-Objects in a 3D Visualization). Can't think of an elegant way to fix this in my application: Replace "." by "," is not possible because commas have a different meaning in json. Using quotation would result in a string. We have to use another json library at the moment. If all our systems/test-systems where configured to English locale, we would not have found the bug and shipped a faulty application (which is not your fault of cause, thanks for the great work with json11 :) ). But this bug should at least be advertised a bit more.

dabulla avatar May 10 '17 11:05 dabulla

Pull requests welcome.

artwyman avatar May 12 '17 03:05 artwyman

@artwyman

@j4cbo suggested the followings:

  • snprintf_l if available or emulate somehow if not
  • Loop through the string after conversion and replace localeconv()->decimal_point with .
  • Embed our own copy of grisu2 (I think this is my preferred option)

Are all good for you? Probably point 2 would be a quick (though pretty dirty) solution while waiting for someone to implement or port grisu2.

skypjack avatar May 12 '17 07:05 skypjack

I don't have time to go deep, but they all sound plausible. My primary concerns on any new dependencies would be how portable they are, particularly to the mobile platforms where Dropbox uses json11. For something hand-tweaked like #2 I'd just want to make sure it has minimal risk for unintended impact (i.e. don't accidentally mess with commas in strings) and minimal performance impact.

artwyman avatar May 13 '17 04:05 artwyman

Patch for json.cpp

diff --git a/json11.cpp b/json11.cpp
index 9647846..73d76e8 100644
--- a/json11.cpp
+++ b/json11.cpp
@@ -24,7 +24,9 @@
 #include <cmath>
 #include <cstdlib>
 #include <cstdio>
+#include <iomanip>
 #include <limits>
+#include <sstream>
 
 namespace json11 {
 
@@ -56,9 +58,11 @@ static void dump(NullStruct, string &out) {
 
 static void dump(double value, string &out) {
     if (std::isfinite(value)) {
-        char buf[32];
-        snprintf(buf, sizeof buf, "%.17g", value);
-        out += buf;
+        std::stringstream ss;
+        ss.imbue(std::locale::classic());
+        ss.precision(std::numeric_limits<double>::digits10);
+        ss << value;
+        out += ss.str();
     } else {
         out += "null";
     }
@@ -618,7 +622,13 @@ struct JsonParser final {
                 i++;
         }
 
-        return std::strtod(str.c_str() + start_pos, nullptr);
+        std::stringstream ss(str.c_str() + start_pos);
+        ss.imbue(std::locale::classic());
+        ss.precision(std::numeric_limits<double>::digits10);
+        double value = 0;
+        ss >> value;
+
+        return value;
     }
 
     /* expect(str, res)

To reproduce this in tests, locale is need to be set at the start of testcase (commented out in patch).

Patch for test.cpp

diff --git a/test.cpp b/test.cpp
index ab2e2b9..cf26b34 100644
--- a/test.cpp
+++ b/test.cpp
@@ -60,13 +60,16 @@ CHECK_TRAIT(is_nothrow_move_assignable<Json>);
 CHECK_TRAIT(is_nothrow_destructible<Json>);
 
 JSON11_TEST_CASE(json11_test) {
+    // setlocale(LC_ALL, "ru_RU.utf8");
+
     const string simple_test =
-        R"({"k1":"v1", "k2":42, "k3":["a",123,true,false,null]})";
+        R"({"k1":"v1", "k2":42.31415, "k3":["a",123,true,false,null]})";
 
     string err;
     const auto json = Json::parse(simple_test, err);
 
     std::cout << "k1: " << json["k1"].string_value() << "\n";
+    std::cout << "k2: " << json["k2"].number_value() << "\n";
     std::cout << "k3: " << json["k3"].dump() << "\n";
 
     for (auto &k : json["k3"].array_items()) {
@@ -158,12 +161,12 @@ JSON11_TEST_CASE(json11_test) {
     // Json literals
     const Json obj = Json::object({
         { "k1", "v1" },
-        { "k2", 42.0 },
+        { "k2", 42.31415 },
         { "k3", Json::array({ "a", 123.0, true, false, nullptr }) },
     });
 
     std::cout << "obj: " << obj.dump() << "\n";
-    JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42, \"k3\": [\"a\", 123, true, false, null]}");
+    JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42.31415, \"k3\": [\"a\", 123, true, false, null]}");
 
     JSON11_TEST_ASSERT(Json("a").number_value() == 0);
     JSON11_TEST_ASSERT(Json("a").string_value() == "a");

I think it's in the spirit of json11, but whatever, hope this helps.

alekseyt avatar Jan 12 '18 16:01 alekseyt

Thanks for taking a stab at it. Please phrase your submissions in the form of a Pull Request, though, not a patch.

On a quick look, I wonder about the potential performance difference between stringstream and snprintf, and whether that could add up when printing a lot of numbers. I wouldn't know for sure if that's an issue without measuring, though.

artwyman avatar Jan 13 '18 03:01 artwyman

C++17 has std::from_chars() and std::to_chars() specifically for this kind of usage (JSON, etc...). They're also supposed to be very fast (faster than the usual locale-aware functions). An ifdef to enable these for people using C++17 would be very nice.

http://en.cppreference.com/w/cpp/utility/from_chars http://en.cppreference.com/w/cpp/utility/to_chars

ashaduri avatar Jan 13 '18 12:01 ashaduri

Nah, i'm good. I can only note that it's too trivial to be copyrighted, so anyone are welcome to do whatever they want with those patches.

alekseyt avatar Jan 13 '18 13:01 alekseyt