PaddleNLP

How should the training parameters be set when running the GPT model?

Open syy-love opened this issue 3 years ago • 1 comments

Link: https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/gpt

Training parameters from the documentation:

CUDA_VISIBLE_DEVICES=0 python run_pretrain.py \
    --model_type gpt \
    --model_name_or_path gpt2-en \
    --input_dir "./data" \
    --output_dir "output" \
    --weight_decay 0.01 \
    --grad_clip 1.0 \
    --max_steps 500000 \
    --save_steps 100000 \
    --decay_steps 320000 \
    --warmup_rate 0.01 \
    --micro_batch_size 4 \
    --device gpu

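Does anyone know how these parameters are calculated? For example, if my dataset has 10,000 samples, how should I customize them? Is there any related documentation?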

syy-love avatar Jul 26 '22 03:07 syy-love

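Count the tokens. With the configuration here, max_steps * seq_len * batch_size = total_tokens, so 500000 * 1024 * 4 = 2,048,000,000, roughly 2 billion (2B) tokens. Suppose your 10,000 samples have an average length of 256 tokens; then the corpus contains 10000 * 256 = 2,560,000 tokens (about 2.56 million).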

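Divide the two: 500000 * 1024 * 4 / (10000 * 256) = 800, which is equivalent to training for 800 epochs. That is a very large amount of training for this dataset. You could try setting max_steps to 50,000 instead, which runs only 80 epochs.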
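The arithmetic above can be sketched as a small helper, in case you want to try other values. This is only an illustration of the token-count estimate from this reply, not part of PaddleNLP: the function names are hypothetical, seq_len = 1024 is the sequence length used in the calculation above, and the average sample length of 256 tokens is the assumed figure from the example. It also assumes a single GPU, so the batch size per step equals micro_batch_size.

```python
import math


def tokens_seen(max_steps, seq_len, batch_size):
    """Total tokens processed during training: steps * sequence length * batch size."""
    return max_steps * seq_len * batch_size


def approx_epochs(max_steps, seq_len, batch_size, num_samples, avg_len):
    """Approximate number of passes over the corpus for a given max_steps."""
    corpus_tokens = num_samples * avg_len  # assumed corpus size in tokens
    return tokens_seen(max_steps, seq_len, batch_size) / corpus_tokens


def steps_for_epochs(target_epochs, seq_len, batch_size, num_samples, avg_len):
    """Inverse calculation: max_steps needed to reach a target number of epochs."""
    corpus_tokens = num_samples * avg_len
    return math.ceil(target_epochs * corpus_tokens / (seq_len * batch_size))


if __name__ == "__main__":
    # Values from the documented command and the 10,000-sample example.
    print(approx_epochs(500000, 1024, 4, 10000, 256))  # documented max_steps
    print(approx_epochs(50000, 1024, 4, 10000, 256))   # suggested max_steps
    print(steps_for_epochs(80, 1024, 4, 10000, 256))   # steps for 80 epochs
```

If you scale max_steps this way, step-based options such as save_steps and decay_steps probably need to shrink along with it, since the values in the documented command are sized for 500,000 steps.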

ZHUI avatar Jul 26 '22 12:07 ZHUI

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] avatar Dec 08 '22 06:12 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Dec 22 '22 16:12 github-actions[bot]