
How to finetune a llama checkpoint using metaseq?

ganeshjawahar opened this issue on Apr 05 '23 · 0 comments

I want to finetune the 7B llama checkpoint using metaseq. The released llama checkpoints appear to be consolidated versions of the model, and it's not clear how to finetune a consolidated model directly in metaseq. Is there a conversion utility to convert the consolidated checkpoint into the metaseq training-compatible format?

Llama checkpoint dict keys (consolidated): dict_keys(['tok_embeddings.weight', 'norm.weight', 'output.weight', 'layers.0.attention.wq.weight', 'layers.0.attention.wk.weight', 'layers.0.attention.wv.weight', 'layers.0.attention.wo.weight', ....

OPT checkpoint dict keys (metaseq training-compatible format): dict_keys(['model', 'args', 'cfg', 'criterion', 'optimizer_history', 'task_state', 'extra_state', 'shard_metadata'])

Zooming into "model": dict_keys(['flat_param_0', 'decoder.layers.0.flat_param_0', 'decoder.layers.1.flat_param_0', 'decoder.layers.2.flat_param_0', ...
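For reference, this is a minimal sketch of how I am inspecting the two layouts with torch.load; the file names below are just examples from my setup and may differ:

```python
import torch

# llama consolidated checkpoint: a flat dict mapping parameter names to tensors.
# (file name is illustrative, e.g. one consolidated shard of the 7B model)
llama_ckpt = torch.load("consolidated.00.pth", map_location="cpu")
print(list(llama_ckpt.keys())[:5])
# ['tok_embeddings.weight', 'norm.weight', 'output.weight',
#  'layers.0.attention.wq.weight', 'layers.0.attention.wk.weight']

# OPT/metaseq training checkpoint: a nested dict whose "model" entry holds
# FSDP-flattened parameters ("flat_param_*") instead of per-tensor weights.
# (file name is illustrative)
opt_ckpt = torch.load("checkpoint_last-shard0.pt", map_location="cpu")
print(list(opt_ckpt.keys()))
# ['model', 'args', 'cfg', 'criterion', 'optimizer_history', 'task_state',
#  'extra_state', 'shard_metadata']
print(list(opt_ckpt["model"].keys())[:3])
# ['flat_param_0', 'decoder.layers.0.flat_param_0', 'decoder.layers.1.flat_param_0']
```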

Thanks.

ganeshjawahar · Apr 05 '23