Handle 416 status code by deleting the local file and retrying
A server returns 416 if the range we've provided to it is invalid. Whilst the local partially downloaded file exists, any attempt to retry using the same range is guaranteed to fail (unless the file is replaced on the server).
The only way to resolve the issue is to send a valid range header (unknown at this point) or not send one at all.
This patch, in response to a 416 status code, seamlessly deletes the local partially downloaded file and retries the file download. Because the local file has been deleted the retry attempt will not include a range header and hence 416 won't be returned again i.e. no infinite retrying.
If for some odd reason the local file can't be deleted then we gracefully fallback to the previous behaviour of calling the error callback with the 416 status code error.
@thibaultCha Whoops, how rude of me! I forgot...
Thanks for all the work you've put into this library, it's a breeze to use!
I think that makes sense. Thank you!
@thibaultCha I've made the suggested change to the if statement.
Did you try this?
I have tested this in several cases using Amazon S3. The simplest real-world case in which this occurs is when downloading from Amazon S3 and the local file has a file-size of zero bytes. Amazon S3 rejects range headers of the form bytes=count-, where count is the entire file size. This particular circumstance could be accounted for more efficiently by not sending the range header if the local file is 0 bytes. However, this patch handles a wider variety of situations and seemed like the path of least resistance.
The simplest way to test this would be to upload a ~10 MiB file to a server and allow part of the download to complete before cutting off the connection. Then replace the file on S3 (same URL) with a much smaller file 1 KiB etc. and retry the download. Now the range supplied will be larger than the file on the server and hence you'll get 416. Admittedly, that's not a very "real world" test, but it is simple to reproduce.
How does the NSOperation behavies when calling start from an already running one? I think the right pattern would be to stop this NSOperation, and start a new one.
I'm not quite sure that stopping the operation and creating a new one is quite correct in this circumstance. The TCBlobDownloader operation is responsible for downloading a file, as opposed to simply performing a HTTP request. So when retrying in response to a 416 status code the operation hasn't actually finished (failed or succeeded) yet i.e. we don't want to call any callbacks indicating the operation failed (because it hasn't).
However, whilst calling start again does work, you're correct in that it shouldn't really be done:
You can call this method explicitly if you want to execute your operations manually. However, it is a programmer error to call this method on an operation object that is already in an operation queue or to queue the operation after calling this method. Once you add an operation object to a queue, the queue assumes all responsibility for it.
That comment in the Apple docs is mostly referring to default start implementation, as in this case there are no adverse effects in calling start again. Nonetheless, in the spirit of preserving NSOperation semantics I have updated the pull request and refactored start into a separate method called startDownload, which is now called when retrying and from `start``. I've also made sure KVC callbacks aren't called more than once
You should also report the error to the delegate or the error block (see notifyFromError:) ;)
If the operation had failed then I'd definitely agree. However, hopefully my logic above (and below) explains why I haven't done so in all circumstances.
I wrote the following in the pull request description:
If for some odd reason the local file can't be deleted then we gracefully fallback to the previous behaviour of calling the error callback with the 416 status code error.
Explaining that in more detail, I meant - if the download is retrying after receiving a 416, then the operation hasn't yet failed - so the error callback is not called. However, if a 416 is received and for some reason the file fails to delete, then the file error will be logged and then the original (non-416) code-path will be run (which results in the error callback being called with the 416 error). In this circumstance I did consider calling the error callback with the file error instead, however I felt a consumer of the API would be unclear as to why the file was even being deleted in the first place. Realistically failure to delete the file shouldn't really occur anyway.
I've attempted to adopt the coding style of the project, please let me know if there's something I missed.
Whoops, published the comment explaining the code before the code itself was pushed.
The updated code should be all there now.
Hi, sorry for the big delay! Could you please rebase? I would be willing to merge this. Thank you :)
Sure, I think I'm a bit too busy to do it today. However, if I haven't done this by mid next week please ping me again.