Large numbers in JSON will not be represented correctly in JavaScript

Open fulldecent opened this issue 5 years ago • 9 comments

Large integers, like 9999999999999999999999999, are valid in JSON files and are valid as data parts in the 0xcert conventions. However, JavaScript's native number type cannot currently represent these numbers with full fidelity. This violates some guarantees made by (or assumptions you might make while using) the 0xcert Framework. Workarounds are available.

Discussion

These examples are running Safari Version 12.0.3 (14606.4.5).

Large numbers cannot currently be represented as integer literals.

9999999999999999999999999 == 9999999999999999999999998
// Outputs: true

However, they can be represented in JSON on your server or in some other environment. If we have two such files, we can compare their JSON strings.

a = '{"tokenId": 9999999999999999999999999}';
b = '{"tokenId": 9999999999999999999999998}';
a == b;
// Outputs: false

But if you decode those objects and compare...

a = '{"tokenId": 9999999999999999999999999}';
b = '{"tokenId": 9999999999999999999999998}';
JSON.stringify(JSON.parse(a)) == JSON.stringify(JSON.parse(b))
// Outputs: true
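
A minimal sketch of why the decoded objects compare equal, assuming an engine that provides ES2015 `Number.isSafeInteger`. The variable names are illustrative only.

```javascript
// Both literals are far beyond Number.MAX_SAFE_INTEGER (2^53 - 1), so
// JSON.parse rounds each one to the nearest double, which is the same value.
const parsedA = JSON.parse('{"tokenId": 9999999999999999999999999}');
const parsedB = JSON.parse('{"tokenId": 9999999999999999999999998}');

console.log(parsedA.tokenId === parsedB.tokenId);   // true
console.log(Number.isSafeInteger(parsedA.tokenId)); // false
console.log(JSON.stringify(parsedA));               // {"tokenId":1e+25}
```

The original digits are already gone by the time any code can inspect the parsed value, which is why comparing after decoding cannot tell the two documents apart.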

The 0xcert Framework currently uses native JSON parsing. This means...

:warning: If 0xcert Framework is acting on JSON that includes integers larger than the JavaScript safe limit (Number.MAX_SAFE_INTEGER, which is 2^53 - 1), then you will not have access to the full fidelity of those integers.

And also...

:warning: If 0xcert Framework is testing/exposing unsafe large integers then the test/proof results might be inaccurate.

Workaround

If you encode large integers as strings in your JSON, then this does not affect you.

Example:

{
  "tokenId": "4045927345087450934570349857"
}
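
A short sketch of consuming such a string-encoded ID without losing digits. It assumes a runtime with native `BigInt` (ES2020), which is newer than the Safari version used in the examples above. A library such as bignumber.js could fill the same role, and the variable names are illustrative.

```javascript
// A token ID encoded as a JSON string survives JSON.parse unchanged.
const doc = JSON.parse('{"tokenId": "4045927345087450934570349857"}');

// BigInt converts the decimal string to an exact arbitrary-precision integer.
const tokenId = BigInt(doc.tokenId);
const nextId = tokenId + 1n;

console.log(doc.tokenId);        // 4045927345087450934570349857
console.log(nextId.toString());  // 4045927345087450934570349858
console.log(tokenId === nextId); // false
```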

Also, if the same environment that runs the 0xcert Framework is used both to create and to consume the JSON files, then this does not affect you. In other words, in that case it is a JavaScript problem: you are programming in JavaScript when sending data to the 0xcert Framework, so you should already know that you cannot send the number 9999999999999999999999999 to the 0xcert Framework as an integer with full fidelity.

Work plan

Steps for now

  • [ ] Document this known issue in the relevant places
    • [ ] Conventions documentation
    • [ ] Certification module documentation
    • [ ] Conceptual documentation
    • [ ] Release notes

Long-term plan

  • [ ] Upgrade to safer underlying JavaScript
    • [ ] Track solutions (BigNum, TypeScript polyfill?) and their deployment status
    • [ ] Require a newer version of JavaScript for this Framework, or require the use of Babel or a similar cross-compiler

-or-

  • [ ] Implement a safe JSON parse function (possibly as a separate npm package)
  • [ ] Require this safe JSON parser

References

  • 0xcert Framework issue #336
  • https://github.com/douglascrockford/JSON-js/issues/107#

fulldecent avatar Mar 13 '19 21:03 fulldecent

JSON (JavaScript Object Notation) should represent numbers the same way as Javascript does.

Very large numbers should be serialized to JSON as strings and handled in Javascript using libraries like bignumber.js which is already used in the framework.

alko89 avatar Mar 13 '19 21:03 alko89

That is a workaround.

But as a general best practice, I disagree. JSON should represent numbers the way JSON is defined. Specifically, like this:

[Image: Screen Shot 2019-03-13 at 5 34 35 PM]

JavaScript should handle these numbers properly, like you say, with BigNum or equivalent. Or at a minimum should throw when encountering these large numbers.

But to adapt the JSON to the limitations of Javascript... that is backwards. There are already enough deployed systems using JSON that we should work with them rather than making everybody else change.

Here is one example, tweet IDs in the Twitter API. https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object

They also provide the ID as a string, as you recommend. But it is reasonable to expect systems to be able to work with the integer ID as well.

fulldecent avatar Mar 13 '19 21:03 fulldecent

JSON defines numbers the same as Javascript does, you can see that here: https://tools.ietf.org/html/rfc8259#section-6

I think the only safe way to represent large numbers (or any financial amount, since all JavaScript numbers are floating point) in JavaScript is to use strings.

How to handle this in other systems depends on the implementation. In Python you would use something like json.load(file_json, parse_int=str) or implement a custom encoder/decoder.

alko89 avatar Mar 13 '19 22:03 alko89

Definitely an issue with Javascript, but we all know that JS is a jungle ;). I agree we cover this use case with a note somewhere in our docs. @typwrtr

xpepermint avatar Mar 14 '19 08:03 xpepermint

Feels like a warning should be emitted somewhere if number is an int and > MAX_SAFE_INTEGER.
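
One way such a check could look is a `JSON.parse` reviver that rejects integers outside the safe range. This is only a sketch: `parseJsonStrict` is a hypothetical helper name, not part of the 0xcert Framework, and it throws rather than warns.

```javascript
// Hypothetical helper: parse JSON, but refuse integers outside the safe range.
function parseJsonStrict(text) {
  return JSON.parse(text, (key, value) => {
    if (typeof value === "number" && Number.isInteger(value) && !Number.isSafeInteger(value)) {
      throw new RangeError(`Unsafe integer at key "${key}": ${value}`);
    }
    return value;
  });
}

console.log(parseJsonStrict('{"tokenId": 42}').tokenId); // 42

try {
  parseJsonStrict('{"tokenId": 9999999999999999999999999}');
} catch (err) {
  console.log(err.message); // Unsafe integer at key "tokenId": 1e+25
}
```

The reviver only sees the value after it has been rounded, so it cannot report the original digits, and it also rejects a literal that was deliberately written as `1e25`.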

sbelak avatar Mar 22 '19 17:03 sbelak

Converting to string and comparing would definitely solve the problem.

Valentine-Mario avatar Mar 26 '19 16:03 Valentine-Mario

@Valentine-Mario yes, this is how the framework works at the moment.

xpepermint avatar Mar 26 '19 17:03 xpepermint

Converting a number to string BEFORE the JSON is parsed would solve this problem.

In other words, we specifically do not allow this input:

{"tokenId": 9999999999999999999999999}

Alternatively, parsing the above number as a string is acceptable if it is documented:

...
ob.tokenId == "9999999999999999999999999"
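
The second alternative could be sketched roughly as follows: scan the raw JSON text and wrap unsafe integer literals in quotes before calling the native parser. `parseJsonBigIntsAsStrings` and `TOKEN` are hypothetical names, and the sketch assumes the input is already valid JSON.

```javascript
// Matches either a complete JSON string or a complete JSON number, so digits
// that appear inside strings are consumed as part of the string and skipped.
const TOKEN = /"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/g;

// Hypothetical helper: unsafe integer literals come back as strings.
function parseJsonBigIntsAsStrings(text) {
  const rewritten = text.replace(TOKEN, (token) => {
    if (token.startsWith('"')) return token;   // string: leave untouched
    if (!/^-?\d+$/.test(token)) return token;  // fraction or exponent: keep as a number
    return Number.isSafeInteger(Number(token)) ? token : `"${token}"`;
  });
  return JSON.parse(rewritten);
}

const ob = parseJsonBigIntsAsStrings('{"tokenId": 9999999999999999999999999}');
console.log(ob.tokenId == "9999999999999999999999999"); // true
```

Safe integers still come back as numbers, so callers need to handle both types, and that behavior would need to be documented.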

Both are good solutions. But at present I disagree with the test case ctx.is(toString(9999999999999999999999999), '1e+25'), because a legitimate input is accepted and an inaccurate response is given.



More notes on documenting JSON parsers.

JavaScript's number-to-string conversion functions are well-documented in ES6 7.1.12.1. Also, the built-in JavaScript JSON.parse is well documented in 24.3.1.

However, all third-party JSON parsers for JavaScript listed on json.org also truncate large number inputs. And all of them are wrong, because they neither document this behavior nor throw an exception.

Here's an airplane analogy that explains the problem of building interfaces on top of interfaces. On most fighter jets there is an eject mechanism (here is the F-16's), which is a large yellow pull handle directly at the crotch of the pilot seat or otherwise within reach. The jet's instruction manual explains emergency egress as well as other simulators. Hint: it is a lot more complicated than just pulling the handle! Imagine that you take a sticker with a generic "pull in case of emergency" label and put it on top of that handle, covering the "eject" wording, and you don't tell pilots how it works. Pilots die because that handle is undocumented and they will not be prepared when they use it. Similarly, software gets hacked when it accepts the input {"tokenId": 9999999999999999999999999} without complaining or explaining what should happen with it.

I have also opened issues on all the JSON parsers to improve the documentation. The implementation maintained by the author of JSON also has this problem. He has summarily closed all my issues and says he will not consider them further unless I purchase his book on Amazon and read it. I have purchased it, am reading it, and will continue to press the issue.

fulldecent avatar Apr 08 '19 05:04 fulldecent

I just found this relevant reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER

Basically, it states that the double-precision format can only safely represent integers between -(2^53 - 1) and 2^53 - 1. The same format applies to numbers inside a JSON structure, and this is a design limitation of double-precision numbers.

Personally, I think this should be addressed at the language specification level (which is not the case for JavaScript) or in the end-user application. Having this limitation within the framework could potentially ruin a use case.

alko89 avatar Apr 27 '19 11:04 alko89