ChatGLM-6B
How to do pre-training
Is your feature request related to a problem? Please describe.
No response
Solutions
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E8%84%9A%E6%9C%AC
Additional context
No response
Could you provide a similar pre-training script?
Same question here.
Pretrain

Run the following script to pre-train the GLM-Large model:

```bash
bash scripts/ds_pretrain_nvidia.sh config/ds_block_large.sh
```

The script `scripts/ds_pretrain_nvidia.sh` launches the training program with DeepSpeed. You should change `NUM_WORKERS` and `NUM_GPUS_PER_WORKER` to the number of workers and the number of GPUs per worker, and change `HOST_FILE_PATH` to the path of an OpenMPI-style hostfile. More details about the DeepSpeed launcher can be found here.
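For reference, an OpenMPI-style hostfile lists one node per line together with the number of GPU "slots" that node contributes. A minimal sketch of creating such a file is below; the hostnames `worker-0`/`worker-1`, the slot counts, and the path `/tmp/hostfile` are all placeholders:

```bash
# Hypothetical example: create a two-node hostfile for the DeepSpeed launcher.
# Each line names a node and the number of GPUs ("slots") usable on it.
cat > /tmp/hostfile <<'EOF'
worker-0 slots=8
worker-1 slots=8
EOF
```

`HOST_FILE_PATH` in `scripts/ds_pretrain_nvidia.sh` would then point at `/tmp/hostfile`.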
The file `config/ds_block_large.sh` defines the hyperparameters for pretraining. Most of the arguments are fairly self-explanatory. In particular, `--train-data` can be multiple keywords defined in `NAMED_CORPORA` in `data_utils/corpora.py`. The hyperparameters of the optimizer are defined in the corresponding json file under `config`. The semantics of the json file can be found here.
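As an illustration of what that json contains: DeepSpeed's config schema nests optimizer settings under an `optimizer` key. The sketch below uses hypothetical values and a hypothetical filename, not the actual file shipped with the GLM repo:

```bash
# Hypothetical sketch: a minimal DeepSpeed config json.
# Filename and numeric values are illustrative only; the key names
# (train_micro_batch_size_per_gpu, fp16, optimizer, ...) follow the
# DeepSpeed config schema.
cat > config/my_ds_config.json <<'EOF'
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "fp16": { "enabled": true },
  "optimizer": {
    "type": "Adam",
    "params": { "lr": 1e-4, "weight_decay": 0.01 }
  }
}
EOF
```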
Can this script be used for incremental (continued) pre-training? Thanks.
mark
https://github.com/shibing624/MedicalGPT/blob/main/pretraining.py implements pre-training for ChatGLM-6B.
https://github.com/shibing624/MedicalGPT: see this project for reference; pre-training, instruction fine-tuning, reward model (RM) training, and PPO are all available ready-made.
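A minimal sketch of trying that project (assuming the repo ships a `requirements.txt`; rather than guessing the script's flags, print them with `--help`):

```bash
# Hedged sketch: fetch MedicalGPT and list the actual arguments of its
# pre-training entry point instead of assuming flag names.
git clone https://github.com/shibing624/MedicalGPT.git
cd MedicalGPT
pip install -r requirements.txt   # assumes the repo provides this file
python pretraining.py --help      # prints the script's supported options
```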