soft-prompt-tuning
Can't train against GPT-J
I am having trouble running your code to train GPT-J.
I can't seem to find the proper values for `model_name_or_path` and `base_model_path`. I have tried `EleutherAI/gpt-j-6B` for each of them, but the program errors out.
Can you please provide sample command-line arguments for vanilla GPT-J?
In train.sh you will find `--model_name_or_path ""` - this is the directory where we store pretrained prefix tensors; if the field is left empty, a new prefix is initialized.
`--base_model_path "/export/data/gptj/j6b_ckpt"` - this is the path to the GPT-J checkpoint. When I wrote these scripts, the model had not yet been distributed through Hugging Face and was only available from the EleutherAI share folder. It appears to be this one: https://mystic.the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd, or you can try the full checkpoint: https://mystic.the-eye.eu/public/AI/GPT-J-6B/step_383500.tar.zstd
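For illustration, the relevant invocation inside train.sh would look roughly like this. The training script name (`train.py`) is an assumption, the checkpoint path assumes you unpacked the slim archive to `/export/data/gptj/j6b_ckpt`, and any other flags the script requires are omitted:

```bash
# Sketch of the relevant flags in train.sh (train.py and the checkpoint
# path are assumptions; other required flags are not shown).
# --model_name_or_path "": empty means a new prefix is initialized.
# --base_model_path: directory of the unpacked GPT-J checkpoint.
python train.py \
    --model_name_or_path "" \
    --base_model_path "/export/data/gptj/j6b_ckpt"
```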
Ideally this code should be rewritten to load the model through Hugging Face, but currently it goes through this function: https://github.com/exelents/soft-prompt-tuning/blob/main/gptj_utils.py#L23
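As a rough sketch, loading the model through Hugging Face could look like this, assuming a `transformers` version recent enough to include GPT-J support:

```python
# Minimal sketch of loading GPT-J via Hugging Face instead of the custom
# checkpoint loader in gptj_utils.py; assumes transformers ships GPT-J.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,  # half precision keeps the 6B model manageable
)
```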
Fixed in https://github.com/exelents/soft-prompt-tuning/pull/3