ChatGLM-6B
How to do pre-training
Is your feature request related to a problem? Please describe.
No response
Solutions
https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E8%84%9A%E6%9C%AC
Additional context
No response
Could you provide a similar pre-training script?
Same question here.
Pretrain

Run the following script to pre-train the GLM-Large model:

```bash
bash scripts/ds_pretrain_nvidia.sh config/ds_block_large.sh
```

The script `scripts/ds_pretrain_nvidia.sh` launches the training program with DeepSpeed. You should change `NUM_WORKERS` and `NUM_GPUS_PER_WORKER` to the number of workers and the number of GPUs per worker, and change `HOST_FILE_PATH` to the path of an OpenMPI-style hostfile. More details about the DeepSpeed launcher can be found here.
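For reference, an OpenMPI-style hostfile lists one node per line together with the number of GPU "slots" that node contributes. A minimal sketch of creating such a file is below; the hostnames `worker-0`/`worker-1`, the slot counts, and the path `/tmp/hostfile` are all placeholders:

```bash
# Hypothetical example: create a two-node hostfile for the DeepSpeed launcher.
# Each line names a node and the number of GPUs ("slots") usable on it.
cat > /tmp/hostfile <<'EOF'
worker-0 slots=8
worker-1 slots=8
EOF
```

`HOST_FILE_PATH` in `scripts/ds_pretrain_nvidia.sh` would then point at `/tmp/hostfile`.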
The file `config/ds_block_large.sh` defines the hyperparameters for pretraining. Most of the arguments are fairly self-explanatory. In particular, `--train-data` can be multiple keywords defined in `NAMED_CORPORA` in `data_utils/corpora.py`. The hyperparameters of the optimizer are defined in the corresponding json file under `config`. The semantics of the json file can be found here.
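As an illustration of what that json contains: DeepSpeed's config schema nests optimizer settings under an `optimizer` key. The sketch below uses hypothetical values and a hypothetical filename, not the actual file shipped with the GLM repo:

```bash
# Hypothetical sketch: a minimal DeepSpeed config json.
# Filename and numeric values are illustrative only; the key names
# (train_micro_batch_size_per_gpu, fp16, optimizer, ...) follow the
# DeepSpeed config schema.
cat > config/my_ds_config.json <<'EOF'
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 8,
  "fp16": { "enabled": true },
  "optimizer": {
    "type": "Adam",
    "params": { "lr": 1e-4, "weight_decay": 0.01 }
  }
}
EOF
```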
Can this script be used for incremental (continued) pre-training? Thanks.
mark
https://github.com/shibing624/MedicalGPT/blob/main/pretraining.py implements pre-training for ChatGLM-6B.
https://github.com/shibing624/MedicalGPT: see this project for reference; pre-training, instruction fine-tuning, reward model (RM) training, and PPO are all available ready-made.
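A minimal sketch of trying that project (assuming the repo ships a `requirements.txt`; rather than guessing the script's flags, print them with `--help`):

```bash
# Hedged sketch: fetch MedicalGPT and list the actual arguments of its
# pre-training entry point instead of assuming flag names.
git clone https://github.com/shibing624/MedicalGPT.git
cd MedicalGPT
pip install -r requirements.txt   # assumes the repo provides this file
python pretraining.py --help      # prints the script's supported options
```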