LocalAI
LocalAI copied to clipboard
feat(dowloader): resume partial downloads
Description
This PR adds support for resuming partial downloads.
- [x] Resuming partial downloads
- [x] Verifying checksum
- [x] Reflecting download percentage on front end after resuming download
Notes for Reviewers A few questions:
- Should we check if the server supports Range Header(used here for partial downloads), most modern web servers do.
- I wanted to have a discussion when a model in gallery has multiple files.
Signed commits
- [x] Yes, I signed my commits.
Deploy Preview for localai ready!
| Name | Link |
|---|---|
| Latest commit | a17c9e3718e8546b7aa982f6773d48423e2e12bf |
| Latest deploy log | https://app.netlify.com/sites/localai/deploys/677ebd96baafa10008f8477d |
| Deploy Preview | https://deploy-preview-4537--localai.netlify.app |
| Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site configuration.
Hi @mudler, can you have a look at this PR and the notes for reviewers section. Thanks
Hi @mudler, can you have a look at this PR and the notes for reviewers section. Thanks
Sure! will try to review today, thanks for the PR.
1. Should we check if the server supports Range Header(used here for partial downloads), most modern web servers do.
I think we probably should - mainly because otherwise if a web server does not support it we leave a dangling .partial file which can't be removed, and the model can't be installed anymore
2. I wanted to have a discussion when a model in gallery has multiple files.
In this case it should use the same underlying download function, so partial files would be recovered as well by this changeset, or am I missing anything?
the changeset looks good to me. My only open point is about detecting if a webserver doesn't support streaming files from a certain point, otherwise we should find another way to remove the .partials when we can't resume for some reason and start a fresh download instead (e.g. the partial data is invalid, or the webserver does not support it, or there are other relevant errors to the download)
I think we probably should - mainly because otherwise if a web server does not support it we leave a dangling
.partialfile which can't be removed, and the model can't be installed anymore
the changeset looks good to me. My only open point is about detecting if a webserver doesn't support streaming files from a certain point, otherwise we should find another way to remove the
.partialswhen we can't resume for some reason and start a fresh download instead (e.g. the partial data is invalid, or the webserver does not support it, or there are other relevant errors to the download)
Yes, there will be the issue with dangling .partial file, I was unsure weather we should delete the partial file automatically or let users delete them manually. The way I am thinking of implementing this is sending a HEAD request and checking if there is presence of Accept-Ranges header in the response.
In this case it should use the same underlying download function, so partial files would be recovered as well by this changeset, or am I missing anything?
There is a scenario that might pose a problem, though I do not know how common it would be.
Suppose we have a model in gallery that has 2 files(let's say, SomeSupportingBinary, ModelA) . And one of them(SomeSupportingBinary) got downloaded successfully, while the other one was a partial download(ModelA.partial). Now by the time we had resumed the download, the ModelA in the web server got updated. On a user's system when they resume the download, after completion, the downloaded ModelA will not match the corresponding SHA in _gallery_MODELA.yml and hence it will be deleted.
But here lies the issue that the SomeSupportingBinary will still be in the user's file system. This is one of the scenarios, there are a few others as well
I think we probably should - mainly because otherwise if a web server does not support it we leave a dangling
.partialfile which can't be removed, and the model can't be installed anymorethe changeset looks good to me. My only open point is about detecting if a webserver doesn't support streaming files from a certain point, otherwise we should find another way to remove the
.partialswhen we can't resume for some reason and start a fresh download instead (e.g. the partial data is invalid, or the webserver does not support it, or there are other relevant errors to the download)Yes, there will be the issue with dangling
.partialfile, I was unsure weather we should delete thepartialfile automatically or let users delete them manually. The way I am thinking of implementing this is sending aHEADrequest and checking if there is presence ofAccept-Rangesheader in the response.
Cool, that actually looks a sane approach
In this case it should use the same underlying download function, so partial files would be recovered as well by this changeset, or am I missing anything?
There is a scenario that might pose a problem, though I do not know how common it would be. Suppose we have a model in gallery that has 2 files(let's say,
SomeSupportingBinary,ModelA) . And one of them(SomeSupportingBinary) got downloaded successfully, while the other one was a partial download(ModelA.partial). Now by the time we had resumed the download, theModelAin the web server got updated. On a user's system when they resume the download, after completion, the downloadedModelAwill not match the corresponding SHA in_gallery_MODELA.ymland hence it will be deleted.But here lies the issue that the
SomeSupportingBinarywill still be in the user's file system. This is one of the scenarios, there are a few others as well
Gotcha, however it would be similar to the case if there is another failure (e.g. the server returns a 500) and not strictly related to the changes introduced in this PR.
I don't think should be part of this PR as well but rather be a follow-up to fix these behaviors - in the worse case there would be a file dangling (with the correct checksums), which would pass checks if the user would try to re-install the model (and deleting it), but that was the case before your changes.
Hi, thanks for your feedback and sorry for the delayed response.
I don't think should be part of this PR as well but rather be a follow-up to fix these behaviors - in the worse case there would be a file dangling (with the correct checksums), which would pass checks if the user would try to re-install the model (and deleting it), but that was the case before your changes.
Understood!
I'll work on implementing the server check.
Thanks!
Hi I've implemented the server check and a test for the same.
Thanks!
Thanks for the review, looking to contribute in other areas of the repository!