emg-toolkit
Add a "--resume" option for the bulk_downloader
MGnify has some very large studies, and downloading them is problematic. With the current implementation, if there is a network issue, there is no way to restart the download process and reuse the files already downloaded.
This feature will require the following (this is just a brain dump):
- Store the tool's progress in a SQLite database or a text file (the total number of pages, and the download status of each page)
- Add a "--resume" flag or sniff at the results folder before starting downloading data
- Use the state to start downloading from that point
- Verify the downloaded files against their checksums
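
To make the brain dump a bit more concrete, here is a minimal sketch of the state store and the checksum check, using only the Python standard library. It is not taken from the bulk_downloader code: the table layout, the function names (`open_state`, `register_pages`, `mark_done`, `pending_pages`, `is_complete`) and the default checksum algorithm are all assumptions made for illustration, and the real tool would have to use whatever checksum the API actually reports.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_state(db_path):
    """Open (or create) the progress database. Hypothetical schema."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "page INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()
    return conn


def register_pages(conn, total_pages):
    """Record every page once; pages known from an earlier run keep their status."""
    conn.executemany(
        "INSERT OR IGNORE INTO pages (page) VALUES (?)",
        [(page,) for page in range(1, total_pages + 1)],
    )
    conn.commit()


def mark_done(conn, page):
    """Flag a page as fully downloaded."""
    conn.execute("UPDATE pages SET status = 'done' WHERE page = ?", (page,))
    conn.commit()


def pending_pages(conn):
    """Return the pages that still need downloading, in order."""
    rows = conn.execute(
        "SELECT page FROM pages WHERE status != 'done' ORDER BY page"
    )
    return [row[0] for row in rows]


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_complete(path, expected_checksum, algorithm="sha256"):
    """True if the file exists and its checksum matches the expected value."""
    path = Path(path)
    return path.is_file() and file_checksum(path, algorithm) == expected_checksum
```

With something like this, a run started with "--resume" could call `register_pages` with the page count, loop over `pending_pages`, and call `mark_done` after each page whose files pass `is_complete`. Files that fail the check would simply be downloaded again.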