json icon indicating copy to clipboard operation
json copied to clipboard

NaN and Infinity support

Open ZFail opened this issue 2 years ago • 7 comments

Аdded support for parsing NaN and Infinity. Serializer already supports this.

ZFail avatar Dec 03 '21 12:12 ZFail

As far as I know, neither NaN nor infinity is a part of JSON format

grisumbras avatar Dec 03 '21 13:12 grisumbras

As far as I know, neither NaN nor infinity is a part of JSON format

Yes, but for example comments too. JSON5 is supporting NaN and Infinity. Maybe add by option?

ZFail avatar Dec 03 '21 13:12 ZFail

What do you accept for NaN or infinity?

pdimov avatar Dec 03 '21 16:12 pdimov

What do you accept for NaN or infinity?

JSON

{
  "nan": NaN,
  "infinity": Infinity,
  "minus_infinity": -Infinity
}

parsed as

object obj( {
  { "nan",  std::numeric_limits<double>::quiet_NaN() }, 
  { "infinity", std::numeric_limits<double>::infinity() }, 
  { "minus_infinity", -std::numeric_limits<double>::infinity() }
} );

ZFail avatar Dec 03 '21 18:12 ZFail

This needs to be disabled by default and enabled with an option. See https://www.boost.org/doc/libs/1_77_0/libs/json/doc/html/json/ref/boost__json__parse_options.html

grisumbras avatar Dec 03 '21 18:12 grisumbras

The JSON5 spec also allows +NaN, -NaN and +Infinity:

The IEEE 754 value positive infinity must be the literal characters Infinity and may be prefixed with an optional plus sign. The IEEE 754 value negative infinity must be the literal characters -Infinity. The IEEE 754 value NaN must be the literal characters NaN and may be prefixed with an optional plus or minus sign.

pdimov avatar Dec 03 '21 18:12 pdimov

Codecov Report

Merging #657 (9dee834) into develop (ee01580) will decrease coverage by 0.02%. The diff coverage is 98.80%.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #657      +/-   ##
===========================================
- Coverage    99.19%   99.18%   -0.02%     
===========================================
  Files           70       70              
  Lines         6851     6996     +145     
===========================================
+ Hits          6796     6939     +143     
- Misses          55       57       +2     
Impacted Files Coverage Δ
include/boost/json/basic_parser.hpp 100.00% <ø> (ø)
include/boost/json/basic_parser_impl.hpp 98.36% <98.80%> (+0.02%) :arrow_up:

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update ee01580...9dee834. Read the comment docs.

codecov[bot] avatar Apr 30 '23 14:04 codecov[bot]

Hi! I would like to pick up this PR (@ZFail has added me to his repo). We need this feature either way, but I'd like to know if there is interest in merging it.

I've already committed optional support for Infinity/NaN parsing through parse_options::allow_infinity_and_nan. If there is interest, then I'll add more tests.

Another question is regarding leading + and -NaN support. As @pdimov mentioned, JSON5 supports these. But IMO, supporting leading + signs specifically for Infinity and NaN is not a good idea. In JSON5 this syntax is supported for numbers in general, which is not true for JSON. If there is interest in supporting leading + signs, then that should either be extended to all numbers (with the parse option renamed to allow_extended_numbers), or implemented separately with it's own parse option.

-NaN is more of a grey area. It could definitely make sense to support this, given that the sign of a NaN is actually observable using std::signbit(). But in that case the serializing code should also be modified to produce -NaN for such values. Otherwise we loose the ability to roundtrip numbers, which is the main point of this feature.

glebov-andrey avatar Apr 30 '23 16:04 glebov-andrey

This is still not reviewed and merged?

vinniefalco avatar May 01 '23 01:05 vinniefalco

This is still not reviewed and merged?

Yeah, it was sort of forgotten about for a while. Since I needed to rebase this onto 1.82.0 anyway, I decided to take over the PR.

I also left a couple TODOs in the code, if someone could have a look at them: https://github.com/boostorg/json/pull/657/commits/2dbb08f3fccbe8599e204655a68e3f256f4959ba#diff-2cc706ac1836b6457314bc5f7f00837fc38310395e9d1486782a82b7d5b6e583R105 https://github.com/boostorg/json/pull/657/commits/2dbb08f3fccbe8599e204655a68e3f256f4959ba#diff-ac0f59cf63d51320da1fc37682921e31221528ac2e1c8412363fa3f4d2dead5cR769

glebov-andrey avatar May 01 '23 10:05 glebov-andrey

The massive switch/case in basic_parser::parse_document() is not the most easily extendable thing. Here is a template recursion based replacement, if you're interested: https://github.com/ZFail/json/commit/887a3e8223f6935c86e08f58ee35a13ae5117457 I put it in a separate branch in case such an implementation is not acceptable, e.g. due to debug performance.

glebov-andrey avatar May 01 '23 15:05 glebov-andrey

Why specifically Infinity and NaN? My cursory research shows that the only implementation that outputs these is Python.

grisumbras avatar May 04 '23 13:05 grisumbras

Why specifically Infinity and NaN? My cursory research shows that the only implementation that outputs these is Python.

I think the original reasoning is simple - that's what Boost.JSON serializes them as: https://github.com/boostorg/json/blob/ee01580cfde818e31d8ea41c2e4056bcff9e3018/include/boost/json/detail/ryu/detail/common.hpp#L85-L99 Given that JSON is closely related to JS, and JS has Infinity and NaN as global properties, this actually seems reasonable. That's also probably the reasoning behind JSON5's spelling.

glebov-andrey avatar May 04 '23 19:05 glebov-andrey

While JS has them, it doesn't parse them as numbers. But I guess, this is a common enough approach to pick it.

One question. I'm not sure we actually need the template parameter for whether special number support is enabled. Can you test if the template parameter affects performance and how much it affects binary size?

Also, please rebase on the current develop and squash. after that I will approve CI runs.

grisumbras avatar May 05 '23 15:05 grisumbras

Are you still planning to do this? I've made a few changes to parsing of JSON literals that makes implementing this change much easier, and also simplifies getting full coverage.

grisumbras avatar May 22 '23 05:05 grisumbras

Are you still planning to do this? I've made a few changes to parsing of JSON literals that makes implementing this change much easier, and also simplifies getting full coverage.

I'm really sorry! The last few weeks have been rather busy for me, and I missed this comment. I take it, this PR has now been superseded by #899, and so there's no point in doing anything here?

Regarding replacing the template parameter with a runtime one: I had initially assumed that the non-conditional checks were the reason for the performance regressions seen in https://github.com/boostorg/json/pull/657#issuecomment-985550216, but clearly that's not the case since #899 is doing fine. A quick binary size comparison with Windows/Clang/Release-LTO shows a 134KiB improvement for the test binary, so runtime checks seem like the way to go.

I'd be glad to help with anything else I can regarding this

glebov-andrey avatar May 31 '23 16:05 glebov-andrey