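[Help!] Models trained with ColossalAI perform very poorly and I cannot figure out why
I trained a GPT2 model with ColossalAI (following the example at https://github.com/hpcaitech/ColossalAI-Examples/tree/main/language/gpt ), but the results are very poor. At inference time the model only predicts commas, periods, and words such as “的” (of), “是” (is), “我” (I), “你” (you), and “这” (this), i.e. the most frequent tokens in the language. I have trained both Chinese and English models, and both behave this way. I cannot tell whether the problem lies in training or in my inference method.
My core inference code is as follows:
Loading the model (trained for 25 epochs):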
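import torch

# gpt2_small: model constructor used in the example linked above (its import was not shown in the original post)
model = gpt2_small()
checkpoint = torch.load("colossalai_model.pt")
model_state = checkpoint['model']
model.load_state_dict(model_state, strict=True)  # Python's boolean literal is True, not TRUE
model.eval()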
Inference:
# Token ids for the text “病情分析:你的” ("Condition analysis: your")
indexed_tokens = [6759, 487, 17, 34, 11]
tokens_tensor = torch.tensor([indexed_tokens])
outputs = model(tokens_tensor)
# Top 10 token ids for each input position of the first (only) sequence
_, top_ix = torch.topk(outputs[0], k=10)
print(top_ix)
Inference result:
tensor([[  16,   11,   12,    5,   22,    7,   69,   34,   17,   26],
        [  16,   11,   12,    5,  487,   17,    7,   34,   22, 1840],
        [  16,   11,   12,    5,   22,    7,   17,   34,   32,  487],
        [  16,   11,   12,   34,   22,   69,    5,   17,    7,   32],
        [  16,   11,   12,    5,   22,   17,   34,   69,    7, 1840]])
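Corresponding vocabulary entries:
5------<pad>
7------<eod>
11------的 (of; possessive particle)
12------。 (full stop)
16------, (comma)
17------: (colon)
22------是 (is)
26------有 (have)
32------这 (this)
34------你 (you)
69------可以 (can)
487------分析 (analysis)
1840------意见 (opinion)
6759------病情 (medical condition)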
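A note on reading the result above: the 5×10 shape of `top_ix` suggests that `outputs[0]` holds one row of logits per prompt position (this layout is an assumption, since the model's output format is not shown in the post). Under that assumption, row *i* ranks candidates for the token after position *i*, and only the last row ranks candidates for the token that follows the whole prompt. The sketch below decodes the posted ids with the posted vocabulary entries; the `decode` helper is hypothetical and written only for illustration.

```python
# Vocabulary entries, prompt ids, and top-10 ids copied from the post above.
vocab = {
    5: "<pad>", 7: "<eod>", 11: "的", 12: "。", 16: ",", 17: ":",
    22: "是", 26: "有", 32: "这", 34: "你", 69: "可以",
    487: "分析", 1840: "意见", 6759: "病情",
}
prompt_ids = [6759, 487, 17, 34, 11]
top_ix = [
    [16, 11, 12, 5, 22, 7, 69, 34, 17, 26],
    [16, 11, 12, 5, 487, 17, 7, 34, 22, 1840],
    [16, 11, 12, 5, 22, 7, 17, 34, 32, 487],
    [16, 11, 12, 34, 22, 69, 5, 17, 7, 32],
    [16, 11, 12, 5, 22, 17, 34, 69, 7, 1840],
]


def decode(ids):
    """Map token ids to text; ids missing from the excerpt are shown as <id>."""
    return [vocab.get(i, f"<{i}>") for i in ids]


# Assumed layout: row i ranks candidates for the token after prompt position i.
for token_id, row in zip(prompt_ids, top_ix):
    print(f"after {vocab[token_id]!r}: {decode(row)}")

# Only the last row ranks candidates for the token that follows the full prompt.
next_token_candidates = decode(top_ix[-1])
print("next-token candidates:", next_token_candidates)
```

Every row starts with the same three ids (16, 11, 12), so the ranking barely depends on the preceding context, which matches the symptom described in the question.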
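This problem has troubled me for more than two months. I still cannot train a model with ColossalAI that infers properly. I have tried both Chinese and English corpora, and the output is always the highest-frequency tokens in the language. I would really appreciate any help!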
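For comparison, the inference code in the question runs a single forward pass, whereas generating text usually means repeating that pass and appending one token at a time. Below is a minimal sketch of greedy decoding under that reading. `greedy_generate`, `logits_fn`, and `toy_logits` are hypothetical names introduced for illustration; with the real model, `logits_fn` would wrap something like `model(torch.tensor([ids]))[0]`, assuming it returns one row of scores per position.

```python
from typing import Callable, List, Sequence


def greedy_generate(
    logits_fn: Callable[[List[int]], Sequence[Sequence[float]]],
    prompt_ids: List[int],
    max_new_tokens: int,
    stop_id: int,
) -> List[int]:
    """Repeatedly append the highest-scoring next token (greedy decoding)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        per_position = logits_fn(ids)  # one row of scores per input position
        last_row = per_position[-1]    # only the last row predicts the next token
        next_id = max(range(len(last_row)), key=last_row.__getitem__)
        ids.append(next_id)
        if next_id == stop_id:         # e.g. the id of <eod> (7 in the post)
            break
    return ids


def toy_logits(ids: List[int]) -> List[List[float]]:
    """Hypothetical stand-in model over 4 token ids: prefers (token + 1) % 4."""
    vocab_size = 4
    rows = []
    for tok in ids:
        row = [0.0] * vocab_size
        row[(tok + 1) % vocab_size] = 1.0
        rows.append(row)
    return rows


generated = greedy_generate(toy_logits, prompt_ids=[0], max_new_tokens=5, stop_id=3)
print(generated)  # [0, 1, 2, 3]
```

If a loop like this still emits only commas and particles with the trained checkpoint, that would point toward the training or checkpoint loading rather than the decoding step.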
Hi @recool08 , we are evaluating the training quality of our GPT2 models. Results are coming as soon as possible. Please keep an eye on our updates.
Thanks!!!
Have you solved this problem?
@recool08 did you find the problem? @kurisusnowdeng have you reported your results on ChatGPT training?
Hi @Syno8, we are still developing our ChatGPT training by following the information released by OpenAI. We do not have many results so far, but we hope to be able to guarantee the training quality of ColossalAI's ChatGPT as soon as possible.