flatbuffers icon indicating copy to clipboard operation
flatbuffers copied to clipboard

Change JSON Handling of inf and nan (#8527)

Open nsmithtt opened this issue 10 months ago • 2 comments

JSON doesn't spec how to handle inf and nan, but there are some unofficial ways of encoding these values that are implemented in javascript and python. Specifically, they use:

  • Infinity
  • -Infinity
  • NaN

This changes the default emitting of these values by the idl json printer.

nsmithtt avatar Feb 14 '25 17:02 nsmithtt

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

google-cla[bot] avatar Feb 14 '25 17:02 google-cla[bot]

@dbaileychess can I get a review of this?

nsmithtt avatar Mar 05 '25 01:03 nsmithtt

ping

nsmithtt avatar Aug 14 '25 17:08 nsmithtt

I am concerned this would be a breaking change -- if there is other code that depends on the lower case output to json, this would cause issues. Similarly, currently the grammar specifically calls out lowercase variants.

jtdavis777 avatar Nov 24 '25 12:11 jtdavis777

I am concerned this would be a breaking change -- if there is other code that depends on the lower case output to json, this would cause issues. Similarly, currently the grammar specifically calls out lowercase variants.

That grammar is for fbs, entirely different spec than JSON, so I don't see the relevance here.

I appreciate the concern, it just makes it incompatible will pretty much all other json tools:

  • nlohmann/json
  • python's json library
  • javascript
  • json5 explicitly does define Infinity and NaN, since this is what is ubiquitous among existing json tools: https://spec.json5.org/#numbers

nsmithtt avatar Nov 24 '25 14:11 nsmithtt

@aardappel probably could use your opinion on this one.

jtdavis777 avatar Nov 24 '25 15:11 jtdavis777

Sadly this is a bit of a breaking change, since the dialect of "FlatBuffers JSON" has been using these C++ based inf / nan values since forever.

For one, this PR wouldn't be complete, since the JSON parser is also expecting the C++ style identifiers. If the CI doesn't fail, we're apparently not testing this, but search for "inf" in the code for example to find a few places it is checked for. JSON files generated by FlatBuffers should roundtrip.

We already have a --strict-json flag, so it would make sense to me that would output these new values only when that flag is on. That flag already does other things, like always quote keys, not use // comments etc.

Parsing is harder. It would probably be ok to simply parse both kinds of constant, though that would slow the parser down for something that apparently no-one has needed yet, so not something I think is worth it. The alternative is to again parse different values based on --strict-json.

Given that the actual JSON doesn't support any of these values, our convention is as good as any, so just keeping things as-is would certainly be the simplest.

aardappel avatar Nov 24 '25 15:11 aardappel

Sadly this is a bit of a breaking change, since the dialect of "FlatBuffers JSON" has been using these C++ based inf / nan values since forever.

For one, this PR wouldn't be complete, since the JSON parser is also expecting the C++ style identifiers. If the CI doesn't fail, we're apparently not testing this, but search for "inf" in the code for example to find a few places it is checked for. JSON files generated by FlatBuffers should roundtrip.

We already have a --strict-json flag, so it would make sense to me that would output these new values only when that flag is on. That flag already does other things, like always quote keys, not use // comments etc.

Parsing is harder. It would probably be ok to simply parse both kinds of constant, though that would slow the parser down for something that apparently no-one has needed yet, so not something I think is worth it. The alternative is to again parse different values based on --strict-json.

Given that the actual JSON doesn't support any of these values, our convention is as good as any, so just keeping things as-is would certainly be the simplest.

Ok fair enough. Since no one else seems to have complained about it thus far, perhaps the breakage in compatibility isn't worth it. I also don't have the bandwidth at this time to add support for --strict-json mode, so I will close this PR for now.

nsmithtt avatar Nov 24 '25 21:11 nsmithtt