requests icon indicating copy to clipboard operation
requests copied to clipboard

iter_content() checks for < 1 chunk_size

Open alex-wenzel opened this issue 6 years ago • 8 comments
trafficstars

With requests v2.20.1, supplying a value of 0 for chunk_size causes iter_content() to hang. This made debugging code that dynamically computes a value for chunk_size difficult. These two lines explicitly enforce chunk_size being a positive integer.

alex-wenzel avatar Dec 28 '18 20:12 alex-wenzel

Hi @alex-wenzel, thanks for taking a look at this. I believe this is actually a duplicate of #4876. This is a new behaviour introduced with the last release of urllib3 where 0 no longer denotes "any size". I believe this is probably a bug since it doesn't make sense to iterate in chunks of 0.

In regards to the PR, I don't think we want to handle the issue with an exception. We handle 0 correctly in iter_slices and preventing that would further break the API expectations. We'll track next steps in #4876 for now.

/cc @sethmlarson

nateprewitt avatar Dec 28 '18 20:12 nateprewitt

@nateprewitt That makes sense, thanks for the quick review/reply!

alex-wenzel avatar Dec 28 '18 20:12 alex-wenzel

I'll investigate, thanks for the report.

sethmlarson avatar Dec 28 '18 20:12 sethmlarson

What I'm seeing so far is that urllib3 hasn't (in recent history, tested urllib3 1.18 to 1.24.1) not-blocked indefinitely when using amt=0 into HTTPResponse.stream(). Is this issue different from the one described into #4876 which appears to be about amt=None?

sethmlarson avatar Dec 28 '18 21:12 sethmlarson

@sethmlarson after a closer look, it is not. None does appear to work correctly with all of the combinations I tried today. It looks like Requests has never supported this actually, at least not since 2.0.

@alex-wenzel I'm going to reopen this since I agree we can do a better job here. Part of me would be more inclined to treat 0 the same way we treat None. That would require special-casing logic for 0->None but the alternative of requiring everyone implement this themselves (len('')-> None) seems silly. The case of amt<0 is already handled by the standard library. @sigmavirus24, @sethmlarson, any thoughts on that?

nateprewitt avatar Dec 30 '18 23:12 nateprewitt

What do other streaming interfaces typically do with a chunk size of zero? Any examples in the stdlib? I'd be inclined to copy whatever is done there.

sethmlarson avatar Dec 31 '18 00:12 sethmlarson

For what it's worth, the code I was debugging includes:

file_size = int(r.headers['Content-Length'])
chunk_size = int(file_size / 50)

...so files smaller than 50 bytes would hang forever. I have to imagine this would be one of the most common reasons a user would inadvertently set chunk_size=0 and in this case I think having it behave like None would be reasonable.

Also, slightly tangential to this thread, but it looks like my most recent changes failed in AppVeyor with an ImportError in Python 3.7 specifically that doesn't look related to my changes.

alex-wenzel avatar Dec 31 '18 01:12 alex-wenzel

I took a look at getting this added into 2.27.0 but it has some unfortunate side effects. Non-streamed requests currently succeed at returning data from iter_content for non-positive integers. While I agree that's nonsensical, adding this behavior breaks that workflow, which we won't be able to do in a minor release.

For the time being, this is an unfortunate behavior which we could definitely document for iter_content. We won't be able to merge this PR until it's reworked to maintain current behavior though. Unfortunately, I don't see an immediate way to do that.

nateprewitt avatar Dec 29 '21 03:12 nateprewitt