Continue downloads

Open surak opened this issue 2 years ago • 3 comments

I have all the weights from other experiments. The only difference between what the download script fetches and the models I already have seems to be a "coreml" directory. For example, falcon-7b-instruct:

md5sum  checkpoints/tiiuae/falcon-7b-instruct/* ; md5sum ../text-generation-webui/models/tiiuae_falcon-7b-instruct/*
b4a4d85458aa9e629d239738014cbd6b  checkpoints/tiiuae/falcon-7b-instruct/config.json
8c0d86e4eb3a35abbe4ba0a10ece38a2  checkpoints/tiiuae/falcon-7b-instruct/configuration_RW.py
md5sum: checkpoints/tiiuae/falcon-7b-instruct/coreml: Is a directory
979aa479eb1bda178b8fdc7b34e5ad19  checkpoints/tiiuae/falcon-7b-instruct/generation_config.json
e23c34b2efc3db3b9b96a276a14412ba  checkpoints/tiiuae/falcon-7b-instruct/handler.py
65fb46b2d3e839e1925f2bb65f5019b0  checkpoints/tiiuae/falcon-7b-instruct/modelling_RW.py
88a96a5685c56d4d717524b4ced8e230  checkpoints/tiiuae/falcon-7b-instruct/pytorch_model-00001-of-00002.bin
22cf56a6e7f46e579df3b38f09c6548d  checkpoints/tiiuae/falcon-7b-instruct/pytorch_model-00002-of-00002.bin
06daf8a723cde3c1e4183ab22f5e010b  checkpoints/tiiuae/falcon-7b-instruct/pytorch_model.bin.index.json
e2174dbf6759db708ad2df133c989487  checkpoints/tiiuae/falcon-7b-instruct/README.md
c5a08caeff2260fbde15b6e55ad672da  checkpoints/tiiuae/falcon-7b-instruct/special_tokens_map.json
a68a0d2cfcdacbbfeeffb63e47cf8bbf  checkpoints/tiiuae/falcon-7b-instruct/tokenizer_config.json
a00433e855eadadaf8621597c8191287  checkpoints/tiiuae/falcon-7b-instruct/tokenizer.json
b4a4d85458aa9e629d239738014cbd6b  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/config.json
8c0d86e4eb3a35abbe4ba0a10ece38a2  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/configuration_RW.py
979aa479eb1bda178b8fdc7b34e5ad19  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/generation_config.json
e23c34b2efc3db3b9b96a276a14412ba  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/handler.py
162ac4e73bbe61eaa6a82bcc19349b50  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/huggingface-metadata.txt
65fb46b2d3e839e1925f2bb65f5019b0  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/modelling_RW.py
88a96a5685c56d4d717524b4ced8e230  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/pytorch_model-00001-of-00002.bin
22cf56a6e7f46e579df3b38f09c6548d  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/pytorch_model-00002-of-00002.bin
06daf8a723cde3c1e4183ab22f5e010b  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/pytorch_model.bin.index.json
e2174dbf6759db708ad2df133c989487  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/README.md
c5a08caeff2260fbde15b6e55ad672da  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/special_tokens_map.json
a68a0d2cfcdacbbfeeffb63e47cf8bbf  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/tokenizer_config.json
a00433e855eadadaf8621597c8191287  ../text-generation-webui/models/tiiuae_falcon-7b-instruct/tokenizer.json

If I try to simply complete this directory, the download starts over from the beginning.
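
The md5sum comparison above can also be scripted. Below is a minimal sketch using only the Python standard library; the helper names `md5_of` and `compare_dirs` are hypothetical and are not part of the repository. It hashes the regular files at the top level of two directories and reports which names exist on only one side and which differ in content. Subdirectories such as coreml are skipped, just as md5sum skips them.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, read in chunks so large weights fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(a: Path, b: Path) -> dict:
    """Compare top-level regular files of two directories by name and MD5."""
    sums_a = {p.name: md5_of(p) for p in a.iterdir() if p.is_file()}
    sums_b = {p.name: md5_of(p) for p in b.iterdir() if p.is_file()}
    common = sums_a.keys() & sums_b.keys()
    return {
        "only_in_a": sorted(sums_a.keys() - sums_b.keys()),
        "only_in_b": sorted(sums_b.keys() - sums_a.keys()),
        "mismatched": sorted(n for n in common if sums_a[n] != sums_b[n]),
    }
```

Calling `compare_dirs(Path("checkpoints/tiiuae/falcon-7b-instruct"), Path("../text-generation-webui/models/tiiuae_falcon-7b-instruct"))` on the two directories from the listing should, given the checksums shown, report huggingface-metadata.txt as present only in the second directory and no mismatched files.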

surak avatar Jun 18 '23 15:06 surak

If you already have the weights (*.bin files) downloaded at checkpoints/tiiuae/falcon-7b-instruct/coreml, you can run:

python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/tiiuae/falcon-7b-instruct/coreml --model_name falcon-7b-instruct

carmocca avatar Jun 19 '23 10:06 carmocca

Thanks, that seems to work. Should I make a merge request on the documentation to clarify that?

But this doesn't answer the issue, which is that of continuing downloads. I can close it if no one finds it relevant (it is relevant where internet is slow, but these days...)

surak avatar Jun 19 '23 12:06 surak

> Thanks, that seems to work. Should I make a merge request on the documentation to clarify that?

If you think that would be helpful for others, then go for it. Note that you wouldn't need to do this if downloading the weights as described in the howtos.

> which is that of continuing downloads

The script already sets resume_download=True (https://github.com/Lightning-AI/lit-parrot/blob/main/scripts/download.py#L23). If resuming is not working, I would ask in the huggingface_hub repository: https://github.com/huggingface/huggingface_hub
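
For illustration, here is a minimal sketch of how a download call with that flag might look. It is not the actual contents of scripts/download.py: the helper names `build_download_kwargs` and `download` are hypothetical, and the `checkpoints/<repo_id>` layout is an assumption taken from the paths earlier in this thread. `snapshot_download` is the huggingface_hub function that accepts `resume_download`; I believe newer huggingface_hub releases resume interrupted downloads by default and may warn about or reject the flag, so check it against the installed version.

```python
from pathlib import Path


def build_download_kwargs(repo_id: str, checkpoint_root: str = "checkpoints") -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper; the checkpoints/<repo_id> layout is an assumption.
    """
    return {
        "repo_id": repo_id,
        "local_dir": str(Path(checkpoint_root) / repo_id),
        # Ask the hub client to continue partially downloaded files.
        "resume_download": True,
    }


def download(repo_id: str, checkpoint_root: str = "checkpoints") -> str:
    """Download (or resume downloading) a model repository; returns the local path."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(repo_id, checkpoint_root))
```

Running `download("tiiuae/falcon-7b-instruct")` a second time after an interruption should pick up where the partial files left off, provided the hub client still finds them in the same local directory.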

carmocca avatar Jun 19 '23 18:06 carmocca