Image downloads: Using Dropbox to manage very large downloads
The `save_url` endpoint of the Dropbox API is probably what we're after. This lets the Dropbox API download from arbitrary URLs to a Dropbox folder.
https://blogs.dropbox.com/developers/2015/06/programmatically-saving-a-url-to-dropbox/
https://dropbox.github.io/dropbox-api-v2-explorer/#files_save_url
This involves getting access to the user's Dropbox account via OAuth2, so we can make the API calls.
We have to specify a path in the user's Dropbox to which the image will be saved.
- Assuming an image name of `IMG_2958.jpg`, the save path could just be `IMG_2958.jpg` or perhaps `coralnet_downloads/IMG_2958.jpg`.
- Assuming an image name of `LTER1/IMG_2958.jpg`, the save path could be `coralnet_downloads/LTER1/IMG_2958.jpg`. This means that uploading an entire folder tree (#119) and then downloading it should get the same folder tree back.
- If a filepath already exists, don't overwrite it, and let the user know. There may be other download errors such as the Dropbox account not having enough space, so be prepared to catch and report errors in general.
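A minimal sketch of the path mapping described in the list above. The function name `dropbox_save_path` and the constant `DOWNLOAD_ROOT` are hypothetical, and the folder name is simply the one used in the examples. Depending on the API version, Dropbox may expect a leading `/` on the path, which this sketch does not add.

```python
import posixpath

# Assumed root folder, taken from the examples above.
DOWNLOAD_ROOT = "coralnet_downloads"


def dropbox_save_path(image_name, root=DOWNLOAD_ROOT):
    """Map an image name (optionally containing subfolders) to a save path."""
    # Normalize separators and drop empty, "." and ".." segments so the
    # result always stays inside the download root.
    parts = [
        part
        for part in image_name.replace("\\", "/").split("/")
        if part not in ("", ".", "..")
    ]
    if not parts:
        raise ValueError("image name has no usable path segments")
    return posixpath.join(root, *parts)
```

Because subfolders in the image name are kept, `LTER1/IMG_2958.jpg` ends up under `coralnet_downloads/LTER1/`, which is what preserves the folder tree on a round trip.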
NOTE: `save_url` only takes one URL, and as far as I can tell there is no way to queue downloads via the API. So, quite unfortunately, the user must leave their browser open on this page while the downloads march on.
The API calls would go something like this:
- `save_url` image 1
- `save_url_job` image 1 repeatedly (every 0.5s maybe?), until the job is complete
- `save_url` image 2
- ...
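The sequence above could be sketched roughly as follows. This is only an illustration: it is written as one Python loop for clarity, whereas in the design described here each step would be triggered from the browser. The `client` argument is assumed to behave like the Dropbox Python SDK's `Dropbox` object (`files_save_url` and `files_save_url_check_job_status`); those method names should be checked against the SDK docs before relying on them. `download_sequentially` itself is a hypothetical helper.

```python
import time


def download_sequentially(client, images, poll_interval=0.5, sleep=time.sleep):
    """Save each (url, path) pair in turn, finishing one job before the next.

    Returns a list of (path, error message) tuples for images that failed.
    """
    failures = []
    for url, path in images:
        try:
            # Assumed SDK call: starts the server-side download.
            result = client.files_save_url(path, url)
            if result.is_async_job_id():
                job_id = result.get_async_job_id()
                # Poll the job until it is no longer in progress.
                status = client.files_save_url_check_job_status(job_id)
                while status.is_in_progress():
                    sleep(poll_interval)
                    status = client.files_save_url_check_job_status(job_id)
                if status.is_failed():
                    failures.append((path, str(status.get_failed())))
        except Exception as exc:
            # Catch-all so one bad image does not stop the whole batch.
            failures.append((path, str(exc)))
    return failures
```

Collecting failures instead of raising lets the page report every problem (existing file, no space left, and so on) once the batch has finished.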
To track which images have yet to be downloaded, perhaps the initial Ajax call starting the batch download could save a session variable containing the IDs of images pending download.
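One way the session bookkeeping might look, as a sketch only. The key name `SESSION_KEY` and the three helper functions are hypothetical. `session` is assumed to be a dict-like object such as Django's `request.session`.

```python
# Hypothetical session key for the IDs still waiting to be downloaded.
SESSION_KEY = "pending_download_image_ids"


def start_batch(session, image_ids):
    """Record every image ID in the batch as pending."""
    session[SESSION_KEY] = list(image_ids)


def next_pending(session):
    """Return the next image ID to download, or None when the batch is done."""
    pending = session.get(SESSION_KEY, [])
    return pending[0] if pending else None


def mark_done(session, image_id):
    """Remove an image ID from the pending list and return how many remain."""
    pending = [i for i in session.get(SESSION_KEY, []) if i != image_id]
    # Reassign the key rather than mutating the list in place, so the
    # session backend registers the change.
    session[SESSION_KEY] = pending
    return len(pending)
```

Each Ajax call would then ask for `next_pending`, run the download for that image, and call `mark_done` once the job finishes or fails.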
Worth noting that the `django-storages` app has Dropbox API support, so maybe that could be useful?