Tron1994
> I used the code in https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/train_gpt.py. config.py: from colossalai.nn.optimizer import HybridAdam from colossalai.zero.shard_utils import TensorShardStrategy from titans.model.gpt import gpt2_small, gpt2_36B BATCH_SIZE = 2 NUM_EPOCHS = 60 SEQ_LEN =...
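Problem located: Assertion `srcIndex < srcSelectDimSize` failed. Cause: I was using the vocab.txt from the Yuan corpus, whose size does not match GPT's default vocab_size, so the value has to be changed by hand. It can be set in the gpt_zero3.py config: model = dict( type=gpt2_small, vocab_size=53227, checkpoint=True, ). Hoping this gets fixed.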
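For anyone hitting the same assertion, here is a minimal sketch of a sanity check to run before training. It is not part of ColossalAI or titans; both helper names are hypothetical, and it assumes vocab.txt holds one token per line with the line index used as the token id. The idea is simply that every token id must be smaller than the vocab_size passed to the model, otherwise the embedding lookup indexes out of range.

```python
from pathlib import Path
from typing import Iterable, List


def count_vocab_entries(vocab_path: str) -> int:
    """Count lines in a vocab file (assumes one token per line, id = line index)."""
    with Path(vocab_path).open(encoding="utf-8") as f:
        return sum(1 for _ in f)


def find_out_of_range_ids(token_ids: Iterable[int], vocab_size: int) -> List[int]:
    """Return the distinct token ids that do not fit an embedding table with vocab_size rows."""
    return sorted({i for i in token_ids if i < 0 or i >= vocab_size})
```

If `find_out_of_range_ids(batch_ids, vocab_size)` returns anything for the ids in your tokenized data, the configured vocab_size is too small for the tokenizer, which would be consistent with the failure described above.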
I found that using a chat model as the teacher works better than using a base model.
Have you found any problems? I have run many experiments, and distil_loss also stays essentially flat, decreasing only very slightly.