Support user-defined functions for serialising Inf and NaN
Currently, Inf and NaN are written as Infinity and NaN if allow_inf = true is passed to JSON3.write().
Unfortunately, the standard JSON parser in the browser does not support this syntax. A typical workaround is regex substitution of Infinity with "Infinity", which is slow and error-prone.
If only Inf translation is needed, a nice hack is to translate Inf to 1e1000, which is converted to Infinity by the built-in number parser rather than by the JSON parser.
If NaN is needed as well, the only possibility I am aware of is a reviver. But as there is no standard format that I could propose, I thought a customisable solution would be nice.
I propose to support user-defined translations via value types:
Inf, -Inf and NaN are output as RawType(), for which users can define their own values, e.g.
JSON3.rawbytes(::Val{Inf}) = codeunits("1e1000")
JSON3.rawbytes(::Val{-Inf}) = codeunits("-1e1000")
JSON3.rawbytes(::Val{NaN}) = codeunits("\"__nan__\"")
so that
julia> JSON3.write((a = Inf, b = -Inf32, c = NaN), allow_inf = true)
"{\"a\":1e1000,\"b\":-1e1000,\"c\":\"__nan__\"}"
I've prepared a PR, which I will submit for consideration. There is a slight performance reduction of 1% vs. the existing treatment of Inf, while NaNs are treated a bit faster. I'd consider the changes negligible, given that these values occur rather rarely.
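The value-type dispatch proposed above can be sketched without JSON3. This is only an illustration of the idea, not the code in the PR: `rawbytes` here is a local stand-in for the proposed hook, and `nonfinite_bytes` and `render` are hypothetical helper names.

```julia
# Local stand-in for the proposed hook; this is not JSON3's actual code.
rawbytes(::Val{Inf})  = codeunits("1e1000")
rawbytes(::Val{-Inf}) = codeunits("-1e1000")
rawbytes(::Val{NaN})  = codeunits("\"__nan__\"")

# Select the hook explicitly so that Float32 inputs and any NaN payload
# map onto the three Float64 value types defined above.
function nonfinite_bytes(x::AbstractFloat)
    isnan(x) && return rawbytes(Val(NaN))
    return x > 0 ? rawbytes(Val(Inf)) : rawbytes(Val(-Inf))
end

# Serialise one float: finite values keep their normal string form.
render(x::AbstractFloat) =
    isfinite(x) ? string(x) : String(collect(nonfinite_bytes(x)))

println(join(render.([Inf, -Inf32, NaN, 1.5]), ","))
```

Because the translation is attached to methods, it is global to the session; a second backend with different sentinels would need a different mechanism.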
... or would you rather prefer a version via keyword argument?
I've added another version with keyword argument under the branch hh-infinity2. I first tried a Dict mapping but that performed way slower, then I went with a functional mapping.
julia> mapping(x) = x == Inf ? "__inf__" : x == -Inf ? "-1e1000" : "__nan__"
julia> JSON3.write([Inf32,-Inf32, NaN], inf_mapping = mapping)
"[__inf__,-1e1000,__nan__]"
EDIT: corrected return value
One thought: could it be that a number is not finite but also neither NaN nor Inf? Then the default mapping should probably rather look like this:
_std_mapping(x) = x == Inf ? "Infinity" : x == -Inf ? "-Infinity" : isnan(x) ? "NaN" : string(x)
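As a rough sketch of how such a mapping function could be consumed through a keyword argument, assuming a simplified writer that only handles vectors of floats: `write_floats` and `browser_mapping` are hypothetical names, and this is not the implementation in the hh-infinity2 branch.

```julia
# Default mapping as suggested above; anything else falls through to string(x).
_std_mapping(x) = x == Inf ? "Infinity" : x == -Inf ? "-Infinity" : isnan(x) ? "NaN" : string(x)

# Hypothetical writer for a vector of floats; not a JSON3 function.
function write_floats(xs::AbstractVector{<:AbstractFloat}; inf_mapping = _std_mapping)
    io = IOBuffer()
    print(io, '[')
    for (i, x) in enumerate(xs)
        i > 1 && print(io, ',')
        # Only non-finite values are routed through the user-supplied mapping.
        print(io, isfinite(x) ? string(x) : inf_mapping(x))
    end
    print(io, ']')
    return String(take!(io))
end

# A browser-friendly mapping: overflowing literals for the infinities, a quoted sentinel for NaN.
browser_mapping(x) = x == Inf ? "1e1000" : x == -Inf ? "-1e1000" : "\"__nan__\""

println(write_floats([Inf32, -Inf32, NaN32]))
println(write_floats([1.0, Inf, NaN]; inf_mapping = browser_mapping))
```

Since the mapping is passed per call, the same session can serialise for several backends with different conventions.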
After some re-thinking, I have a slight preference for the kwarg-solution, because users could serialize for different purposes/backends in one application.
Can someone help here? @LilithHafner @JeffBezanson (just saw that you committed to JSON3 recently)
I think having a serialization of Inf and NaN that passes normal browsers so that users can write their own revivers is something strongly desirable.
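To illustrate what a reviver does with such sentinels, here is the same idea sketched in Julia on an already parsed structure. A real browser-side reviver would be a JavaScript function passed to JSON.parse; the sentinel strings and the `revive` function below are assumptions chosen for the example.

```julia
# Fallback: leave every other value untouched.
revive(x) = x

# Hypothetical sentinel strings; any strings that cannot occur in real data would do.
revive(s::AbstractString) =
    s == "__inf__"    ? Inf  :
    s == "__neginf__" ? -Inf :
    s == "__nan__"    ? NaN  : s

# Recurse into containers so nested values are restored as well.
revive(v::AbstractVector) = map(revive, v)
revive(d::AbstractDict) = Dict(k => revive(val) for (k, val) in d)

parsed = Dict("a" => "__inf__", "b" => Any[1.5, "__nan__", "text"])
restored = revive(parsed)
println(restored["a"])
```

The weakness of sentinel strings is visible here: a genuine string value equal to a sentinel would also be converted, which is why the choice of sentinel is best left to the user.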
@hhaensel, sorry this stalled and you put a lot of effort into working on things. I'm fine if you want to go with the approach in your PR. I haven't made it very clear in this repo anywhere, but I've actually been working on a JSON.jl 1.0 release which takes the best parts of JSON.jl + JSON3.jl and combines them into one package that I'm proposing as a 1.0 release for the JSON.jl package (and JSON3.jl would be deprecated). So to that regard, if you want me to merge your PR and we can tag a release to unblock you, I'm fine with that.
For the other work I'm doing, I want to address this issue, and I'm wondering why we wouldn't just allow passing JSON.json(x; ninf="-1e1000")? Were there performance issues with that approach, so you went with the function approach?
Yes, IIRC it was for performance reasons.
In #294 I have a tabular comparison of the different approaches, which I repeat below: The two implementations don't show a large difference in performance. So the choice is rather a matter of taste, I think. Both versions meanwhile support reading.
If you think we should rather go with a named tuple approach I could also try modifying the tuple version. But please comment before I put some effort in that.
fn_mapping(x::Real) = x == Inf ? "\"__inf__\"" : x == -Inf ? "\"__neginf__\"" : "\"__nan__\""
tuple_mapping = ("\"__inf__\"", "\"__neginf__\"", "\"__nan__\"")
x = rand([Inf, NaN, -Inf], 1000)
y = JSON3.write.(x, inf_mapping=fn_mapping)
jy = join(y, "\", \"")
I obtain
| Operation | fn_mapping | tuple_mapping | allow_inf = true |
|---|---|---|---|
| JSON3.write(x, …) | 3.688 μs | 4.114 μs | 3.375 μs |
| JSON3.read.(y, …) | 5.908 ms | 6.035 ms | 5.927 ms |
| JSON3.read.(codeunits.(y), …) | 48.6 μs | 49.7 μs | 46.1 μs |
| JSON3.read(jy, …) | 177.292 ns | 307.983 ns | 165.138 ns |
- My tuple implementation is available as branch hh-infinity-tuple.
- The slow parsing performance in the second row results from the isfile() call, which can be circumvented since the latest patch.
- All results were measured on Windows 11.
We can probably close this, as JSON v1 has adopted this idea?