
Can't train against GPT-J

rezaalavi opened this issue 2 years ago · 2 comments

I am having trouble running your code to train GPT-J.

I can't seem to find the proper values for model_name_or_path and base_model_path. I have tried EleutherAI/gpt-j-6B for each of them, but the program errors out.

Can you please provide the sample command line arguments for vanilla GPT-J?

rezaalavi · Dec 27 '22

In train.sh you will find --model_name_or_path "" - this is the directory where pretrained prefix tensors are stored. If this field is empty, a new prefix will be initialized.

--base_model_path "/export/data/gptj/j6b_ckpt" is the path to the GPT-J checkpoint. When I wrote these scripts, the model had not yet been distributed through Hugging Face and was only available from the EleutherAI share folder. It seems to be this one: https://mystic.the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd or you can try the full checkpoint: https://mystic.the-eye.eu/public/AI/GPT-J-6B/step_383500.tar.zstd
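
For the original question about sample arguments, here is a minimal sketch of how the two flags fit together. This is pieced together only from this thread: the entry-point name is a placeholder and any other flags train.sh passes are omitted.

```bash
# Hypothetical sketch of the two flags discussed above, as they might appear
# inside train.sh. The real script passes additional arguments not covered in
# this thread, and "train.py" is a placeholder for the actual entry point.
python train.py \
  --model_name_or_path "" \
  --base_model_path "/export/data/gptj/j6b_ckpt"
# --model_name_or_path "" : no saved prefix tensors, so a new prefix is initialized
# --base_model_path       : local directory with the extracted GPT-J checkpoint
```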

Ideally this code should be rewritten to load the model through Hugging Face, but currently it goes through this function: https://github.com/exelents/soft-prompt-tuning/blob/main/gptj_utils.py#L23
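
As a rough sketch of what that Hugging Face loading path could look like (an assumption, not the repo's current code - gptj_utils.py loads the raw checkpoint directly), the standard transformers call would be something like:

```python
# Sketch only: not what gptj_utils.py does today. It illustrates the standard
# Hugging Face transformers way of loading GPT-J, which the comment above
# suggests as a replacement for the custom checkpoint loader.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # downloads weights from the HF Hub
```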

exelents · Dec 27 '22

Fixed in https://github.com/exelents/soft-prompt-tuning/pull/3

slush0 · Mar 08 '23