ChatGLM-6B: About incremental pre-training based on ChatGLM-6B

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

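Hello, I am trying to run self-supervised incremental pre-training on domain data based on ChatGLM-6B first, and then do instruction fine-tuning. I have a few questions and would appreciate your advice:

1. How feasible do you think this approach is? Would further self-supervised pre-training on top of ChatGLM-6B seriously damage the capabilities it has already acquired?
2. My current approach does the self-supervised pre-training entirely with the ChatGLM-6B code, which only supports two tasks: predicting the following text from the preceding text, and instruction learning on question-answer pairs. How feasible do you think this is?
3. Do the hyperparameters used to pre-train GLM-6B and GLM-130B, such as the learning rate and optimizer parameters, differ much? Could the relevant materials be made available for study? Thank you!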
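
For readers unfamiliar with the two tasks named in question 2, here is a minimal sketch of how their training labels are commonly built. It is an illustration only, not the ChatGLM-6B authors' code: the helper names `build_lm_example` and `build_qa_example` are hypothetical, the token IDs are made up, model-specific special tokens are omitted, and it assumes the training code shifts labels by one position internally, as Hugging Face causal LM heads typically do.

```python
# Label value that PyTorch's cross-entropy loss ignores by default (ignore_index=-100).
IGNORE_INDEX = -100


def build_lm_example(token_ids):
    """Self-supervised case: every token in the text is a prediction target."""
    return {"input_ids": list(token_ids), "labels": list(token_ids)}


def build_qa_example(prompt_ids, answer_ids):
    """Instruction case: only the answer tokens contribute to the loss."""
    input_ids = list(prompt_ids) + list(answer_ids)
    # Mask the prompt positions so the model is not trained to reproduce the question.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return {"input_ids": input_ids, "labels": labels}


if __name__ == "__main__":
    print(build_lm_example([11, 12, 13]))
    print(build_qa_example([1, 2, 3], [4, 5]))
```

The only difference between the two cases is which positions are masked out of the loss, so both can share one training loop.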

Expected Behavior

No response

Steps To Reproduce

Not needed

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

Jun 02 '23 06:06 MikeHollyWong

Same question: is self-supervised pre-training on your own corpus prone to catastrophic forgetting?

Jun 07 '23 12:06 UknowSth

> Same question: is self-supervised pre-training on your own corpus prone to catastrophic forgetting?

Could someone share a tutorial on how to train chatglm-6B on my own corpus? I would like to train it on a medical corpus I collected myself.

Jun 08 '23 01:06 chengzhen123

Forgetting is most likely a problem with your optimizer and learning rate.
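
To make the learning-rate point concrete, here is a minimal sketch of a schedule that is often suggested for continued pre-training: linear warmup to a small peak, then cosine decay. The function name `lr_at_step` is hypothetical and every default value (`peak_lr`, `warmup_steps`, `min_lr`) is an illustrative placeholder, not a setting recommended by the ChatGLM authors or confirmed anywhere in this thread.

```python
import math


def lr_at_step(step, total_steps, peak_lr=1e-5, warmup_steps=100, min_lr=1e-6):
    """Return the learning rate for a 0-indexed optimizer step."""
    if step < warmup_steps:
        # Linear warmup: ramp from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine decay from peak_lr down to min_lr.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 50, 99, 100, 550, 1000):
        print(s, lr_at_step(s, total_steps=1000))
```

Keeping the peak value small limits how far each update moves the weights away from the released checkpoint, which is the usual reasoning behind this kind of advice.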

Jun 08 '23 03:06 cywjava

Has there been any conclusion on whether incremental training is feasible?

Jun 26 '23 15:06 lwm345

Is there a tutorial you could share?

Jul 12 '23 08:07 cxjtju

How do I do incremental pre-training?

Jul 14 '23 04:07 assmdx

I have also been wanting to work on this recently. Does anyone have an approach to recommend?

Jul 15 '23 04:07 RuiNov1st

https://github.com/shibing624/MedicalGPT Take a look at this project; it has ready-made code for pre-training, instruction fine-tuning, reward model (RM) training, and PPO.

Jul 20 '23 03:07 tomcat123a

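> https://github.com/shibing624/MedicalGPT Take a look at this project; it has ready-made code for pre-training, instruction fine-tuning, reward model (RM) training, and PPO.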

For incremental pre-training, is the maximum input text length 2048? What should be done when the text exceeds that length?
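
One common way to handle over-length text, sketched here under assumptions, is to tokenize the document and split the token IDs into fixed-size windows. The 2048 figure is taken from the question and is not confirmed in this thread; the helper name `split_into_windows` and the optional `stride` overlap are hypothetical choices for illustration, not how MedicalGPT or ChatGLM-6B necessarily do it.

```python
def split_into_windows(token_ids, max_len=2048, stride=None):
    """Split a token ID list into windows of at most max_len tokens.

    stride is the step between window starts; stride < max_len makes
    consecutive windows overlap. By default windows do not overlap.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    stride = max_len if stride is None else stride
    if not 0 < stride <= max_len:
        raise ValueError("stride must be in the range 1..max_len")

    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window has reached the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


if __name__ == "__main__":
    ids = list(range(10))
    print(split_into_windows(ids, max_len=4))
    print(split_into_windows(ids, max_len=4, stride=2))
```

Overlapping windows repeat some tokens across training examples but give each window some preceding context; whether that trade-off is worthwhile depends on the corpus.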

Jul 22 '23 13:07 cxjtju
