Knover
Fine-tune PLATO-2
- I downloaded the 24L model and ran the fine-tuning script:
  bash ./scripts/local/job.sh ./projects/PLATO-2/finetune/24L_train.conf
  I got NaN for my loss at the very beginning of fine-tuning. Am I missing any stages?
- If I run the pre-training script, pre-train stage 1 does not store anything in output/. I assume stages 2.1 and 2.2 require stage 1's output, right? How do I save stage 1's output? Thanks!
You can change the AMP setting in knover/core/model.py (see https://github.com/PaddlePaddle/Knover/blame/develop/knover/core/model.py#L165):
"custom_white_list": ["gelu"],
It seems that older models need fp16 softmax / layer_norm disabled. Thanks for the feedback!
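The effect of that white list can be sketched in plain Python: only ops explicitly allowed (e.g. gelu) run in fp16, while softmax and layer_norm fall back to fp32. The helper below is a hypothetical illustration of the list logic, not Knover's or PaddlePaddle's actual AMP API.

```python
# Hypothetical sketch of AMP op-list resolution (not Knover's real API).
# Ops on the white list run in fp16; black-listed ops are forced to fp32.
DEFAULT_FP16_OPS = {"matmul", "conv2d", "gelu", "softmax", "layer_norm"}

def resolve_dtype(op_name,
                  custom_white_list=("gelu",),
                  custom_black_list=("softmax", "layer_norm")):
    """Return the precision an op runs in under this AMP policy."""
    if op_name in custom_black_list:   # forced fp32, overrides defaults
        return "fp32"
    if op_name in custom_white_list:   # explicitly allowed in fp16
        return "fp16"
    return "fp16" if op_name in DEFAULT_FP16_OPS else "fp32"

if __name__ == "__main__":
    for op in ("gelu", "softmax", "layer_norm", "matmul"):
        print(op, "->", resolve_dtype(op))
```

With "custom_white_list": ["gelu"] and softmax / layer_norm black-listed, the numerically sensitive ops stay in fp32, which is what avoids the NaN loss with older checkpoints.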
As for your second question, I think the pre-training data is too small and save_steps is too large, so training ends before save_steps steps have been run and no checkpoint is written. You can lower save_steps in projects/PLATO-2/pretrain/24L_train_stage-1.conf.
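The arithmetic behind that can be made concrete. Assuming a periodic checkpoint hook that fires every save_steps steps (the dataset size, batch size, and default value below are illustrative, not taken from the conf file), a small dataset can finish training before the hook ever fires:

```python
# Illustrative arithmetic: if total training steps < save_steps,
# the periodic checkpoint hook never fires and output/ stays empty.
def checkpoints_written(num_examples, batch_size, num_epochs, save_steps):
    total_steps = (num_examples // batch_size) * num_epochs
    return total_steps // save_steps  # number of periodic checkpoints

# Toy dataset: 10k examples, batch 64, 1 epoch -> 156 total steps.
print(checkpoints_written(10_000, 64, 1, 1000))  # save_steps=1000 -> 0 checkpoints
print(checkpoints_written(10_000, 64, 1, 100))   # save_steps=100  -> checkpoints saved
```

So either lower save_steps in the stage-1 conf or train on more data so the step count exceeds it.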