json-rust icon indicating copy to clipboard operation
json-rust copied to clipboard

Small error when parsing long float value

Open Mark-Simulacrum opened this issue 9 years ago • 12 comments

JavaScript's JSON.parse doesn't lose precision from my tests.

Test case below:

#[test]
fn large_precision() {
    assert_eq!(parse("0.004750000000000001").unwrap().as_f64().unwrap(), 0.004750000000000001);
}

Mark-Simulacrum avatar Aug 01 '16 13:08 Mark-Simulacrum

If it helps, I have quite a few cases similar to this one. Let me know if they'd be helpful.

Mark-Simulacrum avatar Aug 01 '16 13:08 Mark-Simulacrum

This error comes from the fact that in floating point arithmetic, 4750000000000001f64 * 10e-19f64 is 0.004750000000000002. @maciejhirsz Serde uses 4750000000000001f64 / 10e19f64 because it is more accurate.

dtolnay avatar Aug 01 '16 14:08 dtolnay

Thanks for the tip @dtolnay.

@Mark-Simulacrum sure, I'd gladly include any edge cases in my tests.

maciejhirsz avatar Aug 01 '16 16:08 maciejhirsz

Warning: the files are large; so wget vs opening in browser is a good idea (Chrome, at least, was unable to do anything beneficial with the file: no display and lots of CPU/memory usage).

Some of these may be false (actually correct serialization), but this is a sort | uniq | diff -u of the output of the program that "found" this error: https://gist.githubusercontent.com/Mark-Simulacrum/280d5b4530ae68f958b3b3542bb613e1/raw/120c3ba89daa9a526225496673e5e19f396f0de3/rust-json.diff. I'm not sure how helpful this diff is, but it's a start.

Also, I've found that I'm not 100% sure that Serde's implementation is equivalent to the JS version either; here's the same type of file for the Serde impl: https://gist.githubusercontent.com/Mark-Simulacrum/05431fd1e1048ee1924879cdde2e0fb9/raw/462391ec7e17acb1ae4a32ae852bd45bbda4d2ab/serde.diff. Serde seems to have less errors, but there do appear to still be errors.

Mark-Simulacrum avatar Aug 01 '16 16:08 Mark-Simulacrum

Thanks for that, very helpful!

maciejhirsz avatar Aug 01 '16 19:08 maciejhirsz

@maciejhirsz I'm thinking we should either reopen this or I can move this into #85; still seeing errors with 0.10.1. Diff included (12M) this time I'm nearly 97% certain on the errors since I included more information in the output: https://gist.githubusercontent.com/Mark-Simulacrum/0a26fadc129dc99c441a3df0b42ae9d7/raw/18c0e5bd85939841446706614173cec6c2be51e5/erroring.diff

Mark-Simulacrum avatar Aug 01 '16 19:08 Mark-Simulacrum

I can also provide the raw JSON files (they're actually already on Github (see rust-lang-nursery/rustc-perf and the instructions there as to how to get the "data" directory) if that helps, although from the 1-2 values I've checked everything matches (JS processed the number correctly).

Mark-Simulacrum avatar Aug 01 '16 19:08 Mark-Simulacrum

@dtolnay I reran my serde_json version and diffed that with the latest diff link, and Serde now has exactly the same problems as json-rust. If you'd like I can file an issue with the diff/errors on serde_json's repo.

Mark-Simulacrum avatar Aug 01 '16 19:08 Mark-Simulacrum

@Mark-Simulacrum not sure if I'm reading this right, did the error rate went up with 0.10.1? The first diff is a fraction of this one.

maciejhirsz avatar Aug 01 '16 23:08 maciejhirsz

Well, the diff will be longer since this time I'm not feeding the results through uniq; that way I can be nearly 100% certain that the results aren't false positives. I've recomputed the results just in case: this is the non-unique version and this is the unique version.

Note that the unique version is much smaller than before (and what I gave you before was a unique version). However, the unique version shows that in most cases in my data set, when a precision error is encountered, it seems to be encountered with a few close-lying floats; so the diff is a little harder to interpret.

Mark-Simulacrum avatar Aug 01 '16 23:08 Mark-Simulacrum

I've come up with the below as a sample of what may be a good alternative to manual testing (at least for simple cases to generate). This quickly fails the floats test but passes both signed and unsigned integers (which makes sense).

extern crate json;

#[macro_use]
extern crate quickcheck;

use json::JsonValue;

quickcheck! {
    fn unsigned_integers(n: u64) -> bool {
        JsonValue::from(n).as_u64().unwrap() == n
    }

    fn signed_integers(n: i64) -> bool {
        JsonValue::from(n).as_i64().unwrap() == n
    }

    fn floats(n: f64) -> bool {
        JsonValue::from(n).as_f64().unwrap() == n
    }
}

Mark-Simulacrum avatar Aug 02 '16 03:08 Mark-Simulacrum

Thanks @Mark-Simulacrum, I filed https://github.com/serde-rs/json/issues/128 on our end.

dtolnay avatar Aug 02 '16 04:08 dtolnay