requests
requests copied to clipboard
Make response.next lazily computable.
When attempting to debug #4122 I attempted to use the standard method for debugging this, which is to set allow_redirects=False and to manually check the headers. Unfortunately, this no longer works since @kennethreitz shipped 2.15.0, as Requests now always computes the first redirect target in order to write it into Response.next. This makes it very frustrating to debug problems with Requests redirect following because it cannot be disabled, meaning that monkeypatching is required to prevent Requests blowing up.
I propose that we should either make response.next lazily computable, such that it is only calculated when actually requested, or we should remove it.
Expected Result
I expected that if allow_redirects=False was provided Requests would not run any of its redirect handling code.
Actual Result
It did.
Reproduction Steps
import requests
url = 'http://www.yftxt.com/plus/xiazai_yfx.php?open=2&id=12914&uhash=7fadfdb165fadc7362f9dc90'
requests.get(url, allow_redirects=False)
System Information
$ python -m requests.help
{
"chardet": {
"version": "3.0.3"
},
"cryptography": {
"version": ""
},
"implementation": {
"name": "CPython",
"version": "3.6.0"
},
"platform": {
"release": "16.7.0",
"system": "Darwin"
},
"pyOpenSSL": {
"openssl_version": "",
"version": null
},
"requests": {
"version": "2.17.3"
},
"system_ssl": {
"version": "100020cf"
},
"urllib3": {
"version": "1.21.1"
},
"using_pyopenssl": false
}
we should remove it.
+1
I don't think the implementation is good and I don't think having this on a Response object is the right place for it. There's no way for people to use that Request object with anything other than a Session and if they're already using a Session they have what they need.
We should confirm though: @kennethreitz, what was the rationale for adding it? I assume you had a use-case in mind when you did. If so, it might be better to consider whether there is an alternative factoring of the code that doesn't run into this problem.
I added it to make it simple for people to follow redirect chains manually when allow_redirects=False.
Sure, and that makes sense. The question is: why not have them just call resolve_redirects themselves? This doesn't avoid any of the logic of resolve_redirects, so most of the reasons for manually following redirects go away.
More generally, this breaks allow_redirects=False, because it automatically always follows the first redirect up to just before sending. If we have any errors in that code, it becomes impossible to not hit them, as I've now discovered.
I'm not against removing it, I don't think anyone will be missing out, but if we can find another way to add this API that is cleaner, I'd be for it (for example, changing it to just returning the next url perhaps?).
Should have put more though into it perhaps before pushing it out.
(for example, changing it to just returning the next url perhaps?).
Doesn't resolve the issue: literally the concern is executing the code in resolve_redirects in any form.
Well, I think it's nice to have, but if it's problematic, i'm fine with removing it. I just think it's a nice touch.
I'm not opposed to having it, but it has to be built in a way that doesn't auto-run the code in resolve_redirects. If it computes lazily, that's fine, but it means the Response holds a reference to the Session. Assuming we're ok with that, I'm ok with it as an alternative.
+1 to lazy evaluation
torn about holding onto the session...
I absolutely am against the Response object holding onto the session. Let me explain it from this perspective:
Currently Requests can appear to leak sockets/memory when users pass stream=True and refuse to consume the socket or return it to the connection pool. This is because the Response has a reference to a urllib3.Response which in turn has references to the connection pool and underlying httplib response & socket. The Session also has references to the connection pool. Now Responses, even those created via requests.get, will have a Session instance attached so they can maybe follow redirects? Uh, that's gross. That's especially gross since users will now have all the extra allocation of a session who are just using the functional API. And if they're using it enough, they could build up something awful (depending on how long their Response objects live of course). This just seems like a recipe for disaster.
i don't disagree
I'm with @sigmavirus24 on this. Hanging the Session instance on a Response conceptually seems pretty bizarre and prone to causing bugs. We also have tests suggesting we guarantee a Response object to be pickleable. Would that still hold true with the Session object attached?
If there's a way to maintain the interface without persisting the Session, then great. If not, it may be worth looking at alternatives/removing this method.
There's too much knowledge specific to a Session object for us to move the logic to the Response object, in my opinion, and I don't see a better way than the suggestions provided above, but I haven't thought too hard about how to preserve the interface, frankly.
I have decided to defer to @Lukasa as to what to do here. Either:
- Make it lazily computable.
- Remove it completely (no one's likely using it yet, so no harm done), replace with documentation.
If needed, I'll make the decision.
I suspect my inclination is to replace it completely. More generally we can reconsider how we handle redirects for 3.0. For example, resolve_redirects could be turned into an iterator of Request objects and the Session can handle sending those requests.
That sounds nice! I'm a big fan of the SessionRedirectMixin, but it could use some love.