pyjson5
pyjson5 copied to clipboard
Incorrect serialization of float subclass with custom __str__ method
I have a float subclass with a customized str method, and its str output is appearing in the results of json5.dumps. Here's a simplified example:
>>> import json5
>>> class MyFloat(float):
... def __str__(self):
... return f"<{super().__str__()}>"
>>> print(json5.dumps(MyFloat(1)))
<1.0>
>>> json5.loads(json5.dumps(MyFloat(1))
Traceback (most recent call last):
...
ValueError: <string>:1 Unexpected "<" at column 1
Thanks for reporting the issue. I have reproduced it locally (thanks to the easy repro!). I am a bit surprised that this hasn't been reported before.
It looks like json.dumps(MyFloat(1))
returns 1.0
instead of <1.0>
. I'm not immediately sure why that is; it looks like the json
module is calling repr(MyFloat(1))
instead of str(MyFloat(1))
(and json5 is doing the latter).
My first reaction is that json5
is right and json
is wrong, i.e., we should be calling str
here, not repr
.
That said, I view matching json
's behavior to be much more important than trying to say which one is "right". So, I'll definitely change it, but I'll also poke into this a bit more and see if there's something that explains why json
might be correct to use repr
.
Also, I'll point out that you shouldn't count on loads(dumps())
he to always work in the presence of custom classes, at least not if you want to preserve the fact that the object is a MyFloat
and not a str
, at least not unless you're using a custom parse_float
method. But you probably already know that :).
Thanks for taking a look at this so quick. It looks like the same issue crops up for int
subclasses too.
I did some searching, and found that str
and repr
used to be different for float
before around python 3.2, according to https://bugs.python.org/issue9337. repr
was more precise, which is probably why it was used when json
was originally written. Either way, the methods are identical now, so it doesn't make any difference.
Is there any reason why using float.__str__(MyFloat(1))
directly to sidestep any customization would cause problems?
I did some searching ...
Thanks for those leads! :).
Is there any reason why using
float.__str__(MyFloat(1))
directly to sidestep any customization would cause problems?
I have no idea. Changing the default behavior of any public function can usually cause problems. More importantly, reaching deep into a class's private methods doesn't feel great to me.
I think I can probably get equivalent behavior by actually calling JSONEncoder().encode(MyFloat(1))
and that would provide a stronger guarantee that json5's behavior matches json's. I'll probably code that up and see if it flies.
I forgot that back in 10e4e823acb542c91310c4231ffc37b0cf7c8c57 I removed (the last) dependency on json
.
I don't remember why I didn't want the dependency, exactly: it might have been just for aesthetics, or it might
have been by request from someone.
At any rate, in order to not reintroduce the dependency on json
now, I'll call float.repr and int.repr directly
as you suggested instead.
Okay, I've fixed this in 1f7f8062 and pushed that to PyPI as v0.9.9.
Please let me know if you find any issues, and thanks again!