json11
json11 copied to clipboard
JSON output is locale-dependent and is not cross-platform.
json11 uses locale-dependent functions for conversion between double and it's string representation. On Ubuntu with Russian language it prints doubles with comma like '3,1415926'. On Windows with Russian language pack or any OS with English language it prints '3.1415926' and parser also expects '3.1415926' so '3,1415926' cannot be parsed.
Sergey
Thanks for pointing that out. I'm just a user myself, not any of the developers, and I hadn't thought about this issue so far. This is actually a problem for me too. I wonder if the json specification mandates how floating point numbers are to presented and whether json is cross platform by definition.
Peter
On Wed, Jul 29, 2015 at 6:26 AM, sergey-shambir [email protected] wrote:
json11 uses locale-dependent functions for conversion between double and it's string representation. On Ubuntu with Russian language it prints doubles with comma like '3,1415926'. On Windows with Russian language pack or any OS with English language it prints '3.1415926' and parser also expects '3.1415926' so '3,1415926' cannot be parsed.
— Reply to this email directly or view it on GitHub https://github.com/dropbox/json11/issues/38.
It does: http://www.json.org/
It is very clear that a period is supposed to be used for the decimal point if you look at the parsing diagrams.
This is definitely a bug. We likely don't have time to implement a proper fix, but I'd love to see a pull request. Options include:
- Use snprintf_l if available or emulate somehow if not
- Loop through the string after conversion and replace
localeconv()->decimal_point
with.
- Embed our own copy of
grisu2
(I think this is my preferred option)
json11
also have str->double bug, yes, in parse_number()
it checks that number have "." as delimiter, but after that it uses std::strtod
, and this function depend on locale, so 10.5
it can convert to 10.0
, because of it expects another delimiter.
What about import code from rapid json
, it has very fast string<->double routies: https://github.com/miloyip/rapidjson/tree/master/include/rapidjson/internal
and the code is under MIT license?
Also there is https://github.com/google/double-conversion from author of girus3
(BSD like license), what about use it as submodule?
I convert json11 to usage of double-conversion (for string->double and double->string): https://github.com/Dushistov/json11/commit/115fc7609ddd8bb2661ea0dd5ce54a5b347c0c62
nativejson-benchmark from here https://github.com/miloyip/nativejson-benchmark give:
Before:
Parse Time (ms) 51
Stringify Time (ms) 109
After: Parse Time (ms) 41 Stringify Time (ms) 23
As you can see speedup 24% for parsing, and 374% for dump, plus it not depend on locale.
What do you think?
I like the idea of fixing this, and like the sound of those benchmarks. A submodule seems quite heavy weight for json11, which is currently a very tiny library which is easy to integrate. Dropbox's own build environment wouldn't have any trouble with a submodule, but it's a lot of overhead for other users. Is there a simpler option?
I'll get an answer on whether the license is okay, but I suspect it would be.
Bump. In my opinion this is a pretty heavy bug. I'm happy I found this discussion after googleing the right words. We used json11 in a project for a while and I just started to use it for the first double-values. After parsing there values where all truncated ints (which resulted in completely deformed 3D-Objects in a 3D Visualization). Can't think of an elegant way to fix this in my application: Replace "." by "," is not possible because commas have a different meaning in json. Using quotation would result in a string. We have to use another json library at the moment. If all our systems/test-systems where configured to English locale, we would not have found the bug and shipped a faulty application (which is not your fault of cause, thanks for the great work with json11 :) ). But this bug should at least be advertised a bit more.
Pull requests welcome.
@artwyman
@j4cbo suggested the followings:
-
snprintf_l
if available or emulate somehow if not - Loop through the string after conversion and replace
localeconv()->decimal_point
with.
- Embed our own copy of grisu2 (I think this is my preferred option)
Are all good for you? Probably point 2 would be a quick (though pretty dirty) solution while waiting for someone to implement or port grisu2.
I don't have time to go deep, but they all sound plausible. My primary concerns on any new dependencies would be how portable they are, particularly to the mobile platforms where Dropbox uses json11. For something hand-tweaked like #2 I'd just want to make sure it has minimal risk for unintended impact (i.e. don't accidentally mess with commas in strings) and minimal performance impact.
Patch for json.cpp
diff --git a/json11.cpp b/json11.cpp
index 9647846..73d76e8 100644
--- a/json11.cpp
+++ b/json11.cpp
@@ -24,7 +24,9 @@
#include <cmath>
#include <cstdlib>
#include <cstdio>
+#include <iomanip>
#include <limits>
+#include <sstream>
namespace json11 {
@@ -56,9 +58,11 @@ static void dump(NullStruct, string &out) {
static void dump(double value, string &out) {
if (std::isfinite(value)) {
- char buf[32];
- snprintf(buf, sizeof buf, "%.17g", value);
- out += buf;
+ std::stringstream ss;
+ ss.imbue(std::locale::classic());
+ ss.precision(std::numeric_limits<double>::digits10);
+ ss << value;
+ out += ss.str();
} else {
out += "null";
}
@@ -618,7 +622,13 @@ struct JsonParser final {
i++;
}
- return std::strtod(str.c_str() + start_pos, nullptr);
+ std::stringstream ss(str.c_str() + start_pos);
+ ss.imbue(std::locale::classic());
+ ss.precision(std::numeric_limits<double>::digits10);
+ double value = 0;
+ ss >> value;
+
+ return value;
}
/* expect(str, res)
To reproduce this in tests, locale is need to be set at the start of testcase (commented out in patch).
Patch for test.cpp
diff --git a/test.cpp b/test.cpp
index ab2e2b9..cf26b34 100644
--- a/test.cpp
+++ b/test.cpp
@@ -60,13 +60,16 @@ CHECK_TRAIT(is_nothrow_move_assignable<Json>);
CHECK_TRAIT(is_nothrow_destructible<Json>);
JSON11_TEST_CASE(json11_test) {
+ // setlocale(LC_ALL, "ru_RU.utf8");
+
const string simple_test =
- R"({"k1":"v1", "k2":42, "k3":["a",123,true,false,null]})";
+ R"({"k1":"v1", "k2":42.31415, "k3":["a",123,true,false,null]})";
string err;
const auto json = Json::parse(simple_test, err);
std::cout << "k1: " << json["k1"].string_value() << "\n";
+ std::cout << "k2: " << json["k2"].number_value() << "\n";
std::cout << "k3: " << json["k3"].dump() << "\n";
for (auto &k : json["k3"].array_items()) {
@@ -158,12 +161,12 @@ JSON11_TEST_CASE(json11_test) {
// Json literals
const Json obj = Json::object({
{ "k1", "v1" },
- { "k2", 42.0 },
+ { "k2", 42.31415 },
{ "k3", Json::array({ "a", 123.0, true, false, nullptr }) },
});
std::cout << "obj: " << obj.dump() << "\n";
- JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42, \"k3\": [\"a\", 123, true, false, null]}");
+ JSON11_TEST_ASSERT(obj.dump() == "{\"k1\": \"v1\", \"k2\": 42.31415, \"k3\": [\"a\", 123, true, false, null]}");
JSON11_TEST_ASSERT(Json("a").number_value() == 0);
JSON11_TEST_ASSERT(Json("a").string_value() == "a");
I think it's in the spirit of json11, but whatever, hope this helps.
Thanks for taking a stab at it. Please phrase your submissions in the form of a Pull Request, though, not a patch.
On a quick look, I wonder about the potential performance difference between stringstream and snprintf, and whether that could add up when printing a lot of numbers. I wouldn't know for sure if that's an issue without measuring, though.
C++17 has std::from_chars() and std::to_chars() specifically for this kind of usage (JSON, etc...). They're also supposed to be very fast (faster than the usual locale-aware functions). An ifdef to enable these for people using C++17 would be very nice.
http://en.cppreference.com/w/cpp/utility/from_chars http://en.cppreference.com/w/cpp/utility/to_chars
Nah, i'm good. I can only note that it's too trivial to be copyrighted, so anyone are welcome to do whatever they want with those patches.