cpython
cpython copied to clipboard
Proper or custom JSON serialization of non-finite float values
Feature or enhancement
The goal of this feature is to allow the JSON serializer in stdlib to serialize non-finite values (NaN, Inf, -Inf) according to the JSON specification. Going beyond just conforming to the spec, we could allow for custom serialization behavior.
Previous discussion
This problem was previously discussed in #84813, and a related PR was submitted in 2019, however, @mdickinson suggested I open a fresh issue where we can discuss the implementation in-depth.
Pitch
Currently, python's default JSON serializer encodes values like NaN
as-is, with the explanation being that many JS-based JSON libraries also do this, and that the corresponding parsers can handle such non-conforming input.
In reality, most major browsers do not support this type of encoding and even NodeJS(v14.16.0) acts according to the previously-linked JSON spec.
The keyword argument allow_nan
makes the serializer throw when encountering non-finite values when set to true, but I'd argue it is paramount to ensure compatibility with the spec and modern browsers. Changing the default behavior is of course not needed or possible at this point.
When implementing this feature, there are two main decisions to make.
Firstly, it has to be decided if allow_nan
should be extended to take more datatypes like strings and callables, or if we should create a separate argument for this functionality.
Re-purposing allow_nan
would make the control over such behavior centralized, however the name is very limiting.
It doesn't say anything about other non-finite values, and without looking at the docs, one would think it only takes bool values.
Secondly, it has to be decided how far we want to take this feature.
Do we want to have pre-defined cases like as_is
, throw
, and to_null
, or do we want to allow the user to pass their own callable? The latter is implemented by the linked PR. Having both options is also a possibility.
Overall, each combination of decisions has its advantages and drawbacks. Since I wasn't a part of such discussions before, I don't have a preference. All I want is to see this feature get implemented, and I can create a PR once consensus is reached.
@Dzeri96 Thanks for opening the issue!
There was a fair bit of confusion in #84813; it would be good to eliminate that confusion up front here. To clarify, when you say:
to serialize non-finite values (NaN, Inf, -Inf) according to the JSON specification
An option for this already exists, via allow_nan = False
. I think you (and others) were asking for something different, namely an option to convert NaNs and Infinities to null. Is that correct?
An option for this already exists, via
allow_nan = False
.
After taking a second look at the spec I guess throwing an exception instead of producing any output could be considered conform.
In any case, yes, I'm suggesting we add options for conversion to null
or even custom behavior as explained above.
I know the last week has probably been very hectic with the release of 3.11, but I'd hate this issue to be buried like the MR from 2019. Any tips of how to promote discussion and push this to get reviewed? @mdickinson
@Dzeri96 Maybe a ping on https://discuss.python.org? I'm afraid that I'm not personally likely to have time to look at this this side of Christmas.
A thread has been created here. Let's close this issue and continue discussion there.
Re-opening this issue, since I think the discussion in the thread linked above pointed to a reasonable path forward.
There's a half-solution in #115246: it adds the allow_nan="null"
support, but not the proposed DeprecationWarning
.
@mdickinson I know I promised that I'd deliver a fix for this a year ago, but some things came up... You know how it goes. I guess you took over and implemented everything needed?
@Dzeri96 That's fine, and yes, I absolutely know how it goes. :-)
If you have any cycles to spare, it would be great if you could take a look at #115246 and let me know whether it seems like a reasonable solution to you.
@mdickinson just posted a comment on the MR. Since you own the feature branch, it's best you implement my proposed changes, in case you agree with me. Let me know if I should get involved in any other way.
There's a half-solution in #115246: it adds the
allow_nan="null"
support, but not the proposedDeprecationWarning
.
The solution in #115246 is now a full solution.