stacks-blockchain-docker

feat: skip downloading files if they already exist

Open pradel opened this issue 1 year ago • 2 comments

Description

During the seed process, something might go wrong when inserting data into the DB, so you want to retry the process. Currently the required files are always redownloaded, even though they are already on your system. To speed things up, we could download these files only if they don't already exist. Wdyt of this approach @wileyj?

Type of Change

  • Other

Does this introduce a breaking change?

I don't think so

Are documentation updates required?

No

pradel avatar Jul 04 '24 14:07 pradel

rather than not download the file, i think it would be better to check the checksum of local vs remote before downloading. edit: typed "delete" when i meant "download"

wileyj avatar Jul 04 '24 18:07 wileyj

@wileyj updated the pr, it now checks the local checksum to decide if a download is required
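For illustration, a minimal sketch of what a "checksum before download" check could look like. This is not the PR's actual code: the function name `needs_download` is hypothetical, and it assumes the expected checksum is a SHA-256 hex digest that was already fetched from the remote.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from seed-chainstate.sh).
# $1 = path to the local archive
# $2 = expected SHA-256 hex digest (assumed to come from a remote checksum file)
# Returns 0 when a download is needed, 1 when the local file already matches.
needs_download() {
  local file="$1" expected="$2" actual
  # No local copy at all: a download is required.
  [ -f "$file" ] || return 0
  # Compare the local digest against the expected one.
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  [ "$actual" != "$expected" ]
}
```

The caller would then wrap the existing download step in something like `if needs_download "$archive" "$remote_sum"; then ...; fi`, so a matching local file is reused and a missing or stale one is fetched again.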

pradel avatar Mar 11 '25 14:03 pradel

> @wileyj updated the pr, now, it check the local checksum to decide if a download is required

the change looks like it would work, but i have a question about how this may be used. consider the last steps of the script: https://github.com/pradel/stacks-blockchain-docker/blob/feat/skip-download-file-if-already-exist/scripts/seed-chainstate.sh#L253-L265

is the intent to check the checksum if an archive file was downloaded separately, and then removed at end of the script execution?

i'm on the fence here, but i'm tempted to say we should keep the downloaded files (it's painful to redownload if you need them again).

pending your thoughts on that idea, this looks good to merge though - thanks!

wileyj avatar Mar 12 '25 00:03 wileyj

I tested the code on a new server to confirm things are working as intended. Checking the checksum files also helps to see if there is a new version available, in case the seed failed for some reason and you need to restart the process.

> i'm on the fence here, but i'm tempted to say we should keep the downloaded files (it's painful to redownload if you need them again).

It's indeed painful (I actually had a lot of "connection reset by peer" errors while downloading the files, maybe we could add some options to curl, like automatic retry and some other flags that could help here?), but once the seed step is done you don't really need these files anymore, right?
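As a rough sketch of the curl options meant here (the wrapper name `download_with_retry` and the retry counts are made up for illustration, not taken from the script): `--retry`, `--retry-delay`, `--retry-connrefused` and `--continue-at -` are standard curl flags. By default `--retry` only covers errors curl treats as transient, so a connection reset may additionally need `--retry-all-errors`, which only exists in newer curl releases.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around curl (not part of seed-chainstate.sh).
# $1 = URL to fetch
# $2 = destination file
download_with_retry() {
  local url="$1" dest="$2"
  # --fail            : treat HTTP errors (>= 400) as failures
  # --location        : follow redirects
  # --retry 5         : retry up to 5 times on transient errors
  # --retry-delay 10  : wait 10 seconds between retries
  # --retry-connrefused : also retry when the connection is refused
  # --continue-at -   : resume from the size of an existing partial file
  curl --fail --location \
    --retry 5 --retry-delay 10 --retry-connrefused \
    --continue-at - \
    --output "$dest" "$url"
}
```

Resuming with `--continue-at -` only helps if the server supports range requests, so that part is an assumption about the archive host.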

pradel avatar Mar 12 '25 08:03 pradel

Fair point - i'll think about this some more

wileyj avatar Mar 12 '25 12:03 wileyj