
Any scripts to convert .pt format of checkpoint to hf model?

Open Vattikondadheeraj opened this issue 1 year ago • 3 comments

Hey, are there any readily available scripts to convert .pt checkpoints to the HF format? I am having difficulty with this and it is urgent. Please share the script if there is one. Thank you.

Vattikondadheeraj avatar Oct 02 '24 15:10 Vattikondadheeraj

Did you finetune your model using the FullModelHFCheckpointer?

If so, the .pt suffix can just be renamed to .bin and things should work. Going a step further, you can use HF's own script for converting weights to the safetensors format, which is their preferred format.
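
A minimal sketch of the rename step, assuming the checkpoint files were written by FullModelHFCheckpointer and all sit in one directory. The helper name `rename_pt_to_bin` is made up for illustration and is not part of torchtune; it only changes file extensions and does not touch the weights.

```python
from pathlib import Path


def rename_pt_to_bin(checkpoint_dir: str) -> list[Path]:
    """Rename every *.pt file in checkpoint_dir to *.bin and return the new paths.

    Hypothetical helper: it assumes the .pt files already hold HF-format
    weights (i.e. they were saved by FullModelHFCheckpointer).
    """
    renamed = []
    # sorted() materializes the file list before any file is renamed
    for pt_file in sorted(Path(checkpoint_dir).glob("*.pt")):
        bin_file = pt_file.with_suffix(".bin")
        pt_file.rename(bin_file)  # same bytes, new extension
        renamed.append(bin_file)
    return renamed
```

Calling `rename_pt_to_bin("path/to/output_dir")` (a placeholder path) renames the files in place, so it may be worth copying the directory first if the originals are still needed.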

joecummings avatar Oct 02 '24 15:10 joecummings

@joecummings, hey, I am finetuning the model using FullModelMetaCheckpointer. Is there any way I can convert it to an HF model? Does the above method work?

Vattikondadheeraj avatar Oct 06 '24 15:10 Vattikondadheeraj

@Vattikondadheeraj currently the process to do this is a bit manual. If you have a state dict in the Meta format it can be converted to the HF format via two conversions: first Meta format to torchtune format, then torchtune format to HF format. One suggestion is to cobble together a script where you:

  1. Create an instance of FullModelMetaCheckpointer and load your Meta-format checkpoint into it via load_checkpoint. This should give you a state dict in the torchtune format.
  2. Create an instance of FullModelHFCheckpointer. Then you can call save_checkpoint on this with the state dict you loaded in (1).
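
The two steps above can be sketched roughly as follows. This is an untested outline, not an official torchtune script: `convert_meta_to_hf` is a hypothetical helper that takes two checkpointers you have already constructed and only relies on their `load_checkpoint()` and `save_checkpoint(state_dict, epoch=...)` methods. The commented-out construction at the bottom uses placeholder paths, file names, and model type, and the import path has moved between torchtune releases, so check it against your installed version.

```python
from typing import Any, Dict


def convert_meta_to_hf(
    meta_checkpointer: Any, hf_checkpointer: Any, epoch: int = 0
) -> Dict[str, Any]:
    """Load a Meta-format checkpoint and re-save it through the HF checkpointer.

    Hypothetical helper; both arguments are assumed to be already-constructed
    FullModelMetaCheckpointer / FullModelHFCheckpointer instances.
    """
    # Step 1: load_checkpoint() reads the Meta-format weights and returns a
    # state dict in torchtune format.
    state_dict = meta_checkpointer.load_checkpoint()
    # Step 2: save_checkpoint() converts the torchtune-format weights to HF
    # format and writes them to the HF checkpointer's output directory.
    hf_checkpointer.save_checkpoint(state_dict, epoch=epoch)
    return state_dict


# Example wiring (placeholders throughout, adjust to your setup):
#
# from torchtune.training import FullModelHFCheckpointer, FullModelMetaCheckpointer
#
# meta = FullModelMetaCheckpointer(
#     checkpoint_dir="path/to/meta_checkpoint_dir",
#     checkpoint_files=["your_checkpoint.pt"],
#     model_type="LLAMA3",  # assumption: use the type matching your model
#     output_dir="path/to/scratch_dir",
# )
# hf = FullModelHFCheckpointer(
#     checkpoint_dir="path/to/original_hf_dir",  # expected to contain config.json
#     checkpoint_files=["..."],
#     model_type="LLAMA3",
#     output_dir="path/to/hf_output_dir",
# )
# convert_meta_to_hf(meta, hf)
```

Depending on the torchtune version, FullModelHFCheckpointer may expect its own `load_checkpoint()` to have run before `save_checkpoint()` (for example, to know how weights map to output files), so the helper may need adapting if the save step fails.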

We are planning on a refactor of the checkpointer to make this process a bit easier, so please stay tuned for that as well.

ebsmothers avatar Oct 07 '24 18:10 ebsmothers