How to finetune llama checkpoint using metaseq?
I want to finetune the 7B llama checkpoint using metaseq. It seems the llama checkpoints are consolidated versions of the model, and it's not clear how to finetune a consolidated model directly in metaseq. Is there a conversion utility to convert the consolidated version into a metaseq training-compatible format?
Llama checkpoint dict keys (consolidated):
dict_keys(['tok_embeddings.weight', 'norm.weight', 'output.weight', 'layers.0.attention.wq.weight', 'layers.0.attention.wk.weight', 'layers.0.attention.wv.weight', 'layers.0.attention.wo.weight', ...

OPT checkpoint dict keys (metaseq training-compatible format):
dict_keys(['model', 'args', 'cfg', 'criterion', 'optimizer_history', 'task_state', 'extra_state', 'shard_metadata'])

Zooming into "model":
dict_keys(['flat_param_0', 'decoder.layers.0.flat_param_0', 'decoder.layers.1.flat_param_0', 'decoder.layers.2.flat_param_0', ...
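To illustrate what I have in mind, here is a minimal sketch of the key renaming I imagine a conversion utility would start from. The metaseq-side key names (decoder.embed_tokens.weight, decoder.layers.N.self_attn.q_proj.weight, etc.) are my guesses based on the OPT model code, not a verified mapping, and this does not handle the FSDP flattening ('flat_param_0') or the surrounding 'cfg'/'shard_metadata' structure shown above:

```python
import torch

def rename_llama_keys(consolidated_path, out_path):
    # Load the consolidated LLaMA state dict (keys as listed above).
    state = torch.load(consolidated_path, map_location="cpu")

    # Top-level renames -- the target names are assumptions on my part.
    key_map = {
        "tok_embeddings.weight": "decoder.embed_tokens.weight",
        "norm.weight": "decoder.layer_norm.weight",
        "output.weight": "decoder.output_projection.weight",
    }
    # Per-layer renames, e.g.
    # layers.0.attention.wq.weight -> decoder.layers.0.self_attn.q_proj.weight
    per_layer = {
        "attention.wq.weight": "self_attn.q_proj.weight",
        "attention.wk.weight": "self_attn.k_proj.weight",
        "attention.wv.weight": "self_attn.v_proj.weight",
        "attention.wo.weight": "self_attn.out_proj.weight",
    }

    new_state = {}
    for k, v in state.items():
        if k in key_map:
            new_state[key_map[k]] = v
        elif k.startswith("layers."):
            _, layer_idx, rest = k.split(".", 2)
            if rest in per_layer:
                new_state[f"decoder.layers.{layer_idx}.{per_layer[rest]}"] = v
            else:
                new_state[k] = v  # keep anything I don't know how to map
        else:
            new_state[k] = v

    # Re-flattening into FSDP 'flat_param_0' shards and filling in 'cfg',
    # 'optimizer_history', 'shard_metadata', etc. is the part I don't know how to do.
    torch.save({"model": new_state}, out_path)
```

Is something along these lines the intended path, or is there an existing script in metaseq for this?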
Thanks.