gobbli
Harden file download function
Feature
Make the file download function more robust in how it downloads and caches files.
Motivation
There are some potential pitfalls related to filename collisions, partial downloads, and bookkeeping with the current implementation.
Additional Details
- Hash each URL and store the downloaded file under a directory named after that hash (to prevent collisions when different URLs serve files with the same name)
- Store metadata file with the downloaded file to indicate when it was downloaded successfully
- Save to a temp file and rename it to the final file after the download is complete, to avoid leaving partially downloaded files
- Maybe attempt downloads multiple times in the case of flaky connections
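
A rough sketch of how the four points above might fit together, using only the standard library. This is not gobbli's current implementation: the function name `cached_download`, its parameters (`retries`, `backoff`, `opener`), the `metadata.json` filename, and the `.part` suffix are all hypothetical and chosen for illustration.

```python
import hashlib
import json
import os
import shutil
import tempfile
import time
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def cached_download(url, cache_dir, retries=3, backoff=1.0,
                    opener=urllib.request.urlopen):
    """Hypothetical helper: download `url` into `cache_dir`, return the local path."""
    # One directory per URL, named by the URL's SHA-256 hash, so files with
    # the same name from different URLs cannot collide.
    url_dir = Path(cache_dir) / hashlib.sha256(url.encode("utf-8")).hexdigest()
    url_dir.mkdir(parents=True, exist_ok=True)

    filename = os.path.basename(urlparse(url).path) or "download"
    final_path = url_dir / filename
    meta_path = url_dir / "metadata.json"

    # The metadata file is only written after a successful download, so its
    # presence marks the cached file as complete.
    if final_path.exists() and meta_path.exists():
        return final_path

    attempts = max(1, retries)
    last_error = None
    for attempt in range(attempts):
        # Create the temp file in the target directory so the final rename
        # stays on one filesystem.
        fd, tmp_name = tempfile.mkstemp(dir=url_dir, suffix=".part")
        try:
            with os.fdopen(fd, "wb") as tmp, opener(url) as response:
                shutil.copyfileobj(response, tmp)
            # Move the finished download into place in a single step.
            os.replace(tmp_name, final_path)
            meta = {"url": url, "downloaded_at": time.time()}
            meta_path.write_text(json.dumps(meta))
            return final_path
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            last_error = err
            if os.path.exists(tmp_name):
                os.remove(tmp_name)
            if attempt < attempts - 1:
                time.sleep(backoff * 2 ** attempt)  # simple exponential backoff
    raise last_error
```

Something like `cached_download("https://example.com/model.bin", "/tmp/cache")` would then return the cached path on later calls without hitting the network. The `opener` parameter exists only so that a flaky connection can be simulated in tests; a real implementation might also want to verify checksums or content length before writing the metadata.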