pyjson5 icon indicating copy to clipboard operation
pyjson5 copied to clipboard

Incorrect serialization of float subclass with custom __str__ method

Open ryanmarquardt opened this issue 2 years ago • 3 comments

I have a float subclass with a customized str method, and its str output is appearing in the results of json5.dumps. Here's a simplified example:

>>> import json5
>>> class MyFloat(float):
...     def __str__(self):
...         return f"<{super().__str__()}>"

>>> print(json5.dumps(MyFloat(1)))
<1.0>

>>> json5.loads(json5.dumps(MyFloat(1))
Traceback (most recent call last):
...
ValueError: <string>:1 Unexpected "<" at column 1

ryanmarquardt avatar Jul 31 '22 15:07 ryanmarquardt

Thanks for reporting the issue. I have reproduced it locally (thanks to the easy repro!). I am a bit surprised that this hasn't been reported before.

It looks like json.dumps(MyFloat(1)) returns 1.0 instead of <1.0>. I'm not immediately sure why that is; it looks like the json module is calling repr(MyFloat(1)) instead of str(MyFloat(1)) (and json5 is doing the latter).

My first reaction is that json5 is right and json is wrong, i.e., we should be calling str here, not repr.

That said, I view matching json's behavior to be much more important than trying to say which one is "right". So, I'll definitely change it, but I'll also poke into this a bit more and see if there's something that explains why json might be correct to use repr.

Also, I'll point out that you shouldn't count on loads(dumps())he to always work in the presence of custom classes, at least not if you want to preserve the fact that the object is a MyFloat and not a str, at least not unless you're using a custom parse_float method. But you probably already know that :).

dpranke avatar Jul 31 '22 21:07 dpranke

Thanks for taking a look at this so quick. It looks like the same issue crops up for int subclasses too.

I did some searching, and found that str and repr used to be different for float before around python 3.2, according to https://bugs.python.org/issue9337. repr was more precise, which is probably why it was used when json was originally written. Either way, the methods are identical now, so it doesn't make any difference.

Is there any reason why using float.__str__(MyFloat(1)) directly to sidestep any customization would cause problems?

ryanmarquardt avatar Jul 31 '22 22:07 ryanmarquardt

I did some searching ...

Thanks for those leads! :).

Is there any reason why using float.__str__(MyFloat(1)) directly to sidestep any customization would cause problems?

I have no idea. Changing the default behavior of any public function can usually cause problems. More importantly, reaching deep into a class's private methods doesn't feel great to me.

I think I can probably get equivalent behavior by actually calling JSONEncoder().encode(MyFloat(1)) and that would provide a stronger guarantee that json5's behavior matches json's. I'll probably code that up and see if it flies.

dpranke avatar Jul 31 '22 22:07 dpranke

I forgot that back in 10e4e823acb542c91310c4231ffc37b0cf7c8c57 I removed (the last) dependency on json. I don't remember why I didn't want the dependency, exactly: it might have been just for aesthetics, or it might have been by request from someone.

At any rate, in order to not reintroduce the dependency on json now, I'll call float.repr and int.repr directly as you suggested instead.

dpranke avatar Aug 01 '22 21:08 dpranke

Okay, I've fixed this in 1f7f8062 and pushed that to PyPI as v0.9.9.

Please let me know if you find any issues, and thanks again!

dpranke avatar Aug 01 '22 22:08 dpranke