requests
requests copied to clipboard
Using a `json.JSONDecoder` fails when simplejson is present
It appears that when simplejson
is in the environment requests
preferentially imports it. However, the arguments accepted by JSONDecoder
are inconsistent between the standard library json
and simplejson
leading to errors when using the standard json.JSONDecoder
when simplejson
is in the environment.
Your documentation for response.json()
says:
Parameters: | **kwargs – Optional arguments that json.loads takes.
The documentation for json.loads
says:
json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
and
To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.
I expected to be able to use a custom json.JSONDecoder
without issue and indeed this works on Python 2.7, but in Python 3.7 it fails.
I can see that the issue of preferentially importing simplejson
has been raised a few times e.g.
https://github.com/requests/requests/pull/2516
https://github.com/requests/requests/issues/3052
and that it is slated for removal in Requests 3, which I guess will resolve this issue.
If it is not possible to resolve this issue some other way in Requests 2.x then it would be nice if the documentation around response.json()
was updated to make it clear that the arguments to json.loads
could be either simplejson.loads
or the standard lib json.loads
depending on the environment since the preferential import is effectively undocumented and non-obvious.
Expected Result
A successful JSON load when using r.json(cls=json.JSONDecoder)
.
Actual Result
print(r.json(cls=json.JSONDecoder))
File ".../python3.7/site-packages/requests/models.py", line 897, in json
return complexjson.loads(self.text, **kwargs)
File ".../python3.7/site-packages/simplejson/__init__.py", line 535, in loads
return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'encoding'
Reproduction Steps
pip install requests
pip install simplejson
import json
import requests
r = requests.get('http://localhost:5984')
print(r.json(cls=json.JSONDecoder))
System Information
Note this issue does not occur in Python 2.7.x.
$ python -m requests.help
{
"chardet": {
"version": "3.0.4"
},
"cryptography": {
"version": ""
},
"idna": {
"version": "2.7"
},
"implementation": {
"name": "CPython",
"version": "3.7.0"
},
"platform": {
"release": "18.0.0",
"system": "Darwin"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.20.0"
},
"system_ssl": {
"version": "1000210f"
},
"urllib3": {
"version": "1.24"
},
"using_pyopenssl": false
}
Hi @ricellis, thanks for revisiting this. I don't believe we'll be seeing this removed in the near future until 3.0.0 is ready. If you'd like to make a small addendum to the documentation about this, we'd be happy to review it.
Thanks @nateprewitt I can update the docs, but would you be willing to accept a patch that protects people from the error if they are expecting to be using json
, but simplejson
happens to be present in the env (probably unbeknownst to them).
For example adding a check to Response.json()
that uses the built-in json package if cls
extends the built-in json.decoder.JSONDecoder
. Something like this https://gist.github.com/ricellis/0690ebfc7f1a56df64df87c7a5251e77
Your proposed change breaks the deterministic behaviour of that function which is not ideal in the slightest. I don't think we can accept that.
breaks the deterministic behaviour of that function
It's annoying that the json()
method could end up using a different module to what is used elsewhere, but that seems better than an error to me. It is deterministic though, the outputs are governed by the inputs and I think it only impacts the case that is currently broken:
simplejson |
cls |
current | proposal |
---|---|---|---|
yes | extends json.JSONDecoder |
Error | Decoded by json using cls |
yes | None | Decoded by simplejson |
Decoded by simplejson |
no | extends json.JSONDecoder |
Decoded by json using cls |
Decoded by json using cls |
no | None | Decoded by json |
Decoded by json |
not ideal in the slightest
It was just a first stab at something that might work and I was trying to contain the change to just the specific area of the problem.
I agree it is not the cleanest thing to do, but neither is swapping the JSON module with an API incompatible one. That's where this issue is different from the others that have been reported previously around the preferential import - this has identified a place where simplejson
and json
are not API compatible yet requests
exposes that API to the end user through the Response.json()
kwargs.
That in itself might not be a problem when the requests
caller is also in control of the module environment. As you can see from our linked issue though the problem for us comes when a user of our library includes another module that itself has a dependency on simplejson
. We are using the requests
API in good faith, but it is broken by a transitive dependency over which we have no control. I think the discussion around simplejson
that has preceded this issue didn't have to account for that.
Perhaps someone can think of a better option than my proposal or maybe an option to disable the simplejson
preferential import could be revisited in light of this particular scenario?
Or if you're not interested in fixing 2.x at all then just say so and I'll submit a docs patch and spend my time implementing a workaround in our library instead.
@nateprewitt @sigmavirus24 any thoughts on this ^?
Simplejson is slower than stdlib json implementation in python 3, why don't you want to allow to override the default behavior at least?
I ran into this problem when we added a third-party library that had a transitive dependency on simplejson
. That consequently broke all of our error handling, because requests
now started throwing simplejson.errors.JSONDecodeError
instead of json.JSONDecodeError
that we had catches for. Judging from the referenced issues above, more users are affected.
Having the same issue as the user above. requests
should just be using one or the another. Using either or conditionally only creates confusion.
Any update :)?
I came here because berserk uses request, and this issue comes up. Can you plz give a sttatus update?
See also https://github.com/psf/requests/issues/3052
Our team has run into a similar issue, where we have simplejson installed but we want to use a custom json encoder class in the prepare_body method which now fails
Hi @krystofernewman, it doesn't sound like your issue is related to this one directly. We're specifically discussing simplejson's impact on Response.json()
. There are likely similar issues on the json
Request argument but that's a topic that would be handled separately.
simplejson
support was added in a very different time (almost a decade ago) but has been the default for the majority of the library's life. We agreed some time ago that a documentation update noting this would be accepted but cannot change the behavior until a major version bump.
@nateprewitt gotcha thanks for the response, I can open a separate issue for the impact of simplejson on the encoding of the request's body preparation if that works