requests icon indicating copy to clipboard operation
requests copied to clipboard

Using a `json.JSONDecoder` fails when simplejson is present

Open ricellis opened this issue 5 years ago • 14 comments

It appears that when simplejson is in the environment requests preferentially imports it. However, the arguments accepted by JSONDecoder are inconsistent between the standard library json and simplejson leading to errors when using the standard json.JSONDecoder when simplejson is in the environment.

Your documentation for response.json() says:

Parameters: | **kwargs – Optional arguments that json.loads takes.

The documentation for json.loads says:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

and

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

I expected to be able to use a custom json.JSONDecoder without issue and indeed this works on Python 2.7, but in Python 3.7 it fails.

I can see that the issue of preferentially importing simplejson has been raised a few times e.g. https://github.com/requests/requests/pull/2516 https://github.com/requests/requests/issues/3052

and that it is slated for removal in Requests 3, which I guess will resolve this issue.

If it is not possible to resolve this issue some other way in Requests 2.x then it would be nice if the documentation around response.json() was updated to make it clear that the arguments to json.loads could be either simplejson.loads or the standard lib json.loads depending on the environment since the preferential import is effectively undocumented and non-obvious.

Expected Result

A successful JSON load when using r.json(cls=json.JSONDecoder).

Actual Result

    print(r.json(cls=json.JSONDecoder))
  File ".../python3.7/site-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File ".../python3.7/site-packages/simplejson/__init__.py", line 535, in loads
    return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'encoding'

Reproduction Steps

pip install requests
pip install simplejson
import json
import requests

r = requests.get('http://localhost:5984')
print(r.json(cls=json.JSONDecoder))

System Information

Note this issue does not occur in Python 2.7.x.

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.7"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.7.0"
  },
  "platform": {
    "release": "18.0.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.20.0"
  },
  "system_ssl": {
    "version": "1000210f"
  },
  "urllib3": {
    "version": "1.24"
  },
  "using_pyopenssl": false
}

ricellis avatar Oct 22 '18 11:10 ricellis

Hi @ricellis, thanks for revisiting this. I don't believe we'll be seeing this removed in the near future until 3.0.0 is ready. If you'd like to make a small addendum to the documentation about this, we'd be happy to review it.

nateprewitt avatar Oct 24 '18 05:10 nateprewitt

Thanks @nateprewitt I can update the docs, but would you be willing to accept a patch that protects people from the error if they are expecting to be using json, but simplejson happens to be present in the env (probably unbeknownst to them).

For example adding a check to Response.json() that uses the built-in json package if cls extends the built-in json.decoder.JSONDecoder. Something like this https://gist.github.com/ricellis/0690ebfc7f1a56df64df87c7a5251e77

ricellis avatar Oct 25 '18 15:10 ricellis

Your proposed change breaks the deterministic behaviour of that function which is not ideal in the slightest. I don't think we can accept that.

sigmavirus24 avatar Oct 26 '18 02:10 sigmavirus24

breaks the deterministic behaviour of that function

It's annoying that the json() method could end up using a different module to what is used elsewhere, but that seems better than an error to me. It is deterministic though, the outputs are governed by the inputs and I think it only impacts the case that is currently broken:

simplejson cls current proposal
yes extends json.JSONDecoder Error Decoded by json using cls
yes None Decoded by simplejson Decoded by simplejson
no extends json.JSONDecoder Decoded by json using cls Decoded by json using cls
no None Decoded by json Decoded by json

not ideal in the slightest

It was just a first stab at something that might work and I was trying to contain the change to just the specific area of the problem.

I agree it is not the cleanest thing to do, but neither is swapping the JSON module with an API incompatible one. That's where this issue is different from the others that have been reported previously around the preferential import - this has identified a place where simplejson and json are not API compatible yet requests exposes that API to the end user through the Response.json() kwargs.

That in itself might not be a problem when the requests caller is also in control of the module environment. As you can see from our linked issue though the problem for us comes when a user of our library includes another module that itself has a dependency on simplejson. We are using the requests API in good faith, but it is broken by a transitive dependency over which we have no control. I think the discussion around simplejson that has preceded this issue didn't have to account for that.

Perhaps someone can think of a better option than my proposal or maybe an option to disable the simplejson preferential import could be revisited in light of this particular scenario?

Or if you're not interested in fixing 2.x at all then just say so and I'll submit a docs patch and spend my time implementing a workaround in our library instead.

ricellis avatar Oct 26 '18 09:10 ricellis

@nateprewitt @sigmavirus24 any thoughts on this ^?

ricellis avatar Nov 07 '18 09:11 ricellis

Simplejson is slower than stdlib json implementation in python 3, why don't you want to allow to override the default behavior at least?

SharpEdgeMarshall avatar Mar 31 '20 11:03 SharpEdgeMarshall

I ran into this problem when we added a third-party library that had a transitive dependency on simplejson. That consequently broke all of our error handling, because requests now started throwing simplejson.errors.JSONDecodeError instead of json.JSONDecodeError that we had catches for. Judging from the referenced issues above, more users are affected.

ruuda avatar Jul 29 '20 08:07 ruuda

Having the same issue as the user above. requests should just be using one or the another. Using either or conditionally only creates confusion.

verata-veritatis avatar Jan 27 '21 10:01 verata-veritatis

Any update :)?

BarryThrill avatar May 06 '21 06:05 BarryThrill

I came here because berserk uses request, and this issue comes up. Can you plz give a sttatus update?

rwst avatar Dec 12 '21 07:12 rwst

See also https://github.com/psf/requests/issues/3052

rwst avatar Dec 12 '21 08:12 rwst

Our team has run into a similar issue, where we have simplejson installed but we want to use a custom json encoder class in the prepare_body method which now fails

krystofernewman avatar Jan 31 '22 20:01 krystofernewman

Hi @krystofernewman, it doesn't sound like your issue is related to this one directly. We're specifically discussing simplejson's impact on Response.json(). There are likely similar issues on the json Request argument but that's a topic that would be handled separately.

simplejson support was added in a very different time (almost a decade ago) but has been the default for the majority of the library's life. We agreed some time ago that a documentation update noting this would be accepted but cannot change the behavior until a major version bump.

nateprewitt avatar Jan 31 '22 20:01 nateprewitt

@nateprewitt gotcha thanks for the response, I can open a separate issue for the impact of simplejson on the encoding of the request's body preparation if that works

krystofernewman avatar Jan 31 '22 20:01 krystofernewman