
`tune download` doesn't download the weights

Open optimass opened this issue 10 months ago • 2 comments

$ tune download meta-llama/Meta-Llama-3-8B-Instruct 
Ignoring files matching the following patterns: *.safetensors
Successfully downloaded model repo and wrote to the following locations:
/home/toolkit/ui-copilot/model/.gitattributes
/home/toolkit/ui-copilot/model/LICENSE
/home/toolkit/ui-copilot/model/README.md
/home/toolkit/ui-copilot/model/USE_POLICY.md
/home/toolkit/ui-copilot/model/config.json
/home/toolkit/ui-copilot/model/generation_config.json
/home/toolkit/ui-copilot/model/model.safetensors.index.json
/home/toolkit/ui-copilot/model/original
/home/toolkit/ui-copilot/model/special_tokens_map.json
/home/toolkit/ui-copilot/model/tokenizer_config.json
/home/toolkit/ui-copilot/model/tokenizer.json

Why is it ignoring the safetensors, i.e., the weights?

optimass avatar Apr 23 '24 20:04 optimass

Thanks for filing this issue @optimass!

Currently, tune download defaults to ignoring safetensors (the --ignore-patterns flag defaults to "*.safetensors"): https://github.com/pytorch/torchtune/blob/bec7babec9c924a0ee7ad27e3f6582bc5bd1fef5/torchtune/_cli/download.py#L96, though we could probably do a better job of documenting this.

In this case, this is fine since the original model checkpoints still exist in the model/original subdirectory, which does get downloaded. In that directory you should see a file like consolidated.00.pth, which contains the model checkpoint.

Our current llama3 recipes default to the FullModelMetaCheckpointer (see this example config: https://github.com/pytorch/torchtune/blob/main/recipes/configs/llama3/8B_full.yaml), so once you've pointed the config at the right checkpoint files (see our tutorial for help: https://pytorch.org/torchtune/stable/tutorials/llama3.html), this should all work out of the box.
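
For reference, the checkpointer section of that config would look roughly like the sketch below. The checkpoint_dir and output_dir paths are placeholders based on the download locations shown above, and field names may differ slightly between torchtune versions:

checkpointer:
  _component_: torchtune.utils.FullModelMetaCheckpointer
  # point at the "original" subdirectory that holds the Meta-format weights
  checkpoint_dir: /home/toolkit/ui-copilot/model/original
  checkpoint_files: [
    consolidated.00.pth,
  ]
  output_dir: /home/toolkit/ui-copilot/model
  model_type: LLAMA3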

Let us know if you have any more questions!

cc @joecummings

rohan-varma avatar Apr 23 '24 22:04 rohan-varma

To add on to this, we have stronger testing for PyTorch-native checkpoints, so we usually prefer downloading those. To save space on your machine, we also avoid downloading multiple formats of the same checkpoint, hence the default of ignoring safetensors.

However, we do fully support safetensors loading, so you can download those for Llama3 8B Instruct with the following command:

$ tune download meta-llama/Meta-Llama-3-8B-Instruct --ignore-patterns ""

This will download everything in the repository, so make sure you have enough space!

Like @rohan-varma said, if you do use the safetensors format, just make sure to update the config:

$ tune cp llama3/8B_full_single_device .

Then you can add the following lines in your local config:

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: <checkpoint-dir>
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors,
  ]
  output_dir: ${checkpointer.checkpoint_dir}
  model_type: LLAMA3
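
Once the config is updated, you can launch training against it with the tune CLI, e.g. tune run full_finetune_single_device --config ./8B_full_single_device.yaml (the recipe name here assumes the single-device full finetune; swap in whichever recipe you're actually running).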

Hope this helps!

joecummings avatar Apr 23 '24 22:04 joecummings

Thanks for your help! Ok, I got confused w/ the safetensors. Indeed, the weights are in model/original, thanks!

optimass avatar Apr 24 '24 12:04 optimass