ChatGLM-6B

How to do pretraining

Open ljch2018 opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe.

No response

Solutions

https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E8%84%9A%E6%9C%AC

Additional context

No response

ljch2018 avatar May 20 '23 08:05 ljch2018

Could a similar pretraining script be provided?

ljch2018 avatar May 20 '23 08:05 ljch2018

Same question here.

SupritYoung avatar May 27 '23 09:05 SupritYoung

Pretrain: run the following script to pre-train the GLM-Large model:

bash scripts/ds_pretrain_nvidia.sh config/ds_block_large.sh

The script scripts/ds_pretrain_nvidia.sh launches the training program with DeepSpeed. You should change NUM_WORKERS and NUM_GPUS_PER_WORKER to the number of workers and the number of GPUs per worker, and change HOST_FILE_PATH to the path of an OpenMPI-style hostfile. More details about the DeepSpeed launcher can be found here.
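For reference, an OpenMPI-style hostfile lists one node per line with its GPU count as "slots", and the three variables above are set in the launch script. A minimal sketch, assuming two nodes with 8 GPUs each; the hostnames and paths are placeholders, not values from the GLM repo:

```bash
# /path/to/hostfile -- OpenMPI-style hostfile: one node per line, slots = GPUs on that node
#   node1 slots=8
#   node2 slots=8

# Values to adjust in scripts/ds_pretrain_nvidia.sh (variable names from the comment above)
NUM_WORKERS=2                      # number of nodes listed in the hostfile
NUM_GPUS_PER_WORKER=8              # GPUs per node
HOST_FILE_PATH=/path/to/hostfile   # placeholder path to the hostfile

# Launch pretraining with DeepSpeed
bash scripts/ds_pretrain_nvidia.sh config/ds_block_large.sh
```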

The file config/ds_block_large.sh defines the hyperparameters for pretraining. Most of the arguments are fairly self-explanatory. Specifically, --train-data can be multiple keywords defined in NAMED_CORPORA in data_utils/corpora.py. The hyperparameters of the optimizer are defined in the corresponding json file under config. The semantics of the json file can be found here.

tomcat123a avatar May 29 '23 10:05 tomcat123a

Can the script above also be used for incremental continued pretraining? Thanks.

neverstoplearn avatar Jun 01 '23 07:06 neverstoplearn

mark

MikeHollyWong avatar Jun 02 '23 02:06 MikeHollyWong

mark

zhaoying9105 avatar Jun 17 '23 03:06 zhaoying9105

mark

assmdx avatar Jul 14 '23 04:07 assmdx

https://github.com/shibing624/MedicalGPT/blob/main/pretraining.py This is pretraining for ChatGLM-6B.

tomcat123a avatar Jul 20 '23 02:07 tomcat123a

https://github.com/shibing624/MedicalGPT Refer to this project; pretraining, instruction fine-tuning, reward-model (RM) training, and PPO are all available off the shelf. (For the general shape of the pretraining step, see the sketch at the end of this thread.)

tomcat123a avatar Jul 20 '23 03:07 tomcat123a
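Below is a minimal sketch of continued (causal language model) pretraining of ChatGLM-6B with Hugging Face Transformers, to show the general shape of such a pretraining.py. It is not the MedicalGPT code: the corpus file corpus.txt, the block size, the hyperparameters, and the output directory are placeholder assumptions, and a real 6B-parameter run would additionally need DeepSpeed/ZeRO, gradient checkpointing, or LoRA to fit in GPU memory.

```python
# Sketch of continued pretraining for ChatGLM-6B on a plain-text corpus.
# Assumptions: corpus.txt exists; hyperparameters are illustrative only.
from itertools import chain

from datasets import load_dataset
from transformers import (
    AutoModel,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "THUDM/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

# Load raw text and pack it into fixed-length blocks for causal-LM training.
raw = load_dataset("text", data_files={"train": "corpus.txt"})
block_size = 512  # placeholder; ChatGLM-6B supports longer contexts

def tokenize(examples):
    return tokenizer(examples["text"])

def group_texts(examples):
    # Concatenate all token ids and split them into block_size chunks.
    ids = list(chain.from_iterable(examples["input_ids"]))
    total = (len(ids) // block_size) * block_size
    return {"input_ids": [ids[i:i + block_size] for i in range(0, total, block_size)]}

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
lm_dataset = tokenized.map(group_texts, batched=True, remove_columns=tokenized.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="chatglm-6b-continued",   # placeholder output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-5,
        fp16=True,
        logging_steps=10,
        save_steps=500,
    ),
    train_dataset=lm_dataset,
    # mlm=False: labels are copies of input_ids for next-token prediction
    # (the shift happens inside the model's loss computation).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In practice, wrapping the same Trainer call with a DeepSpeed config (as in the GLM instructions above) or with PEFT/LoRA is what makes continued pretraining of a 6B model feasible on typical hardware.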