requests
requests copied to clipboard
iter_content() checks for < 1 chunk_size
With requests v2.20.1, supplying a value of 0 for chunk_size causes iter_content() to hang. This made debugging code that dynamically computes a value for chunk_size difficult. These two lines explicitly enforce chunk_size being a positive integer.
Hi @alex-wenzel, thanks for taking a look at this. I believe this is actually a duplicate of #4876. This is a new behaviour introduced with the last release of urllib3 where 0 no longer denotes "any size". I believe this is probably a bug since it doesn't make sense to iterate in chunks of 0.
In regards to the PR, I don't think we want to handle the issue with an exception. We handle 0 correctly in iter_slices and preventing that would further break the API expectations. We'll track next steps in #4876 for now.
/cc @sethmlarson
@nateprewitt That makes sense, thanks for the quick review/reply!
I'll investigate, thanks for the report.
What I'm seeing so far is that urllib3 hasn't (in recent history, tested urllib3 1.18 to 1.24.1) not-blocked indefinitely when using amt=0 into HTTPResponse.stream(). Is this issue different from the one described into #4876 which appears to be about amt=None?
@sethmlarson after a closer look, it is not. None does appear to work correctly with all of the combinations I tried today. It looks like Requests has never supported this actually, at least not since 2.0.
@alex-wenzel I'm going to reopen this since I agree we can do a better job here. Part of me would be more inclined to treat 0 the same way we treat None. That would require special-casing logic for 0->None but the alternative of requiring everyone implement this themselves (len('')-> None) seems silly. The case of amt<0 is already handled by the standard library. @sigmavirus24, @sethmlarson, any thoughts on that?
What do other streaming interfaces typically do with a chunk size of zero? Any examples in the stdlib? I'd be inclined to copy whatever is done there.
For what it's worth, the code I was debugging includes:
file_size = int(r.headers['Content-Length'])
chunk_size = int(file_size / 50)
...so files smaller than 50 bytes would hang forever. I have to imagine this would be one of the most common reasons a user would inadvertently set chunk_size=0 and in this case I think having it behave like None would be reasonable.
Also, slightly tangential to this thread, but it looks like my most recent changes failed in AppVeyor with an ImportError in Python 3.7 specifically that doesn't look related to my changes.
I took a look at getting this added into 2.27.0 but it has some unfortunate side effects. Non-streamed requests currently succeed at returning data from iter_content for non-positive integers. While I agree that's nonsensical, adding this behavior breaks that workflow, which we won't be able to do in a minor release.
For the time being, this is an unfortunate behavior which we could definitely document for iter_content. We won't be able to merge this PR until it's reworked to maintain current behavior though. Unfortunately, I don't see an immediate way to do that.