chenboheng comments




chenboheng

Are there still many unresolved problems between ChatGLM and this open-source GLM-130B model?

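Isn't that obvious? The open-source weights are only the pretrained model weights. Many further steps still follow, such as instruction fine-tuning and PPO, so you cannot expect good question answering from the pretrained model alone.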

Error after deployment: size mismatch for transformer.word_embeddings.weight: copying a param with shape torch.Size([18816, 12288]) from checkpoint, the shape in current model is torch.Size([150528, 12288]).

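The mistake is obvious from the log line: using world size: 1 and model-parallel size: 8. The checkpoint you are loading holds only 1/8 of the weights, while the model you actually define has the full dimensions, so the load naturally fails: size mismatch for transformer.word_embeddings.weight: copying a param with shape torch.Size([18816, 12288]) from checkpoint, the shape in current model is torch.Size([150528, 12288]). Note that 18816 × 8 = 150528, so only one of the eight shards was loaded.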
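The arithmetic behind that diagnosis can be checked with a small sketch. It assumes the word embedding is split evenly along the vocabulary (row) dimension across model-parallel ranks, which is what the two shapes in the error suggest. The helper names shard_shape and infer_model_parallel_size are hypothetical and are not part of the GLM-130B code.

```python
def shard_shape(full_shape, model_parallel_size):
    """Shape of one rank's slice, assuming an even split along dimension 0."""
    rows, cols = full_shape
    if rows % model_parallel_size != 0:
        raise ValueError("rows are not evenly divisible by the model-parallel size")
    return (rows // model_parallel_size, cols)


def infer_model_parallel_size(checkpoint_shape, model_shape):
    """Guess how many shards a checkpoint was split into from the shape mismatch."""
    ckpt_rows, ckpt_cols = checkpoint_shape
    model_rows, model_cols = model_shape
    if ckpt_cols != model_cols or model_rows % ckpt_rows != 0:
        raise ValueError("shapes are not consistent with a row-wise split")
    return model_rows // ckpt_rows


# Shapes taken from the error message above.
checkpoint_shape = (18816, 12288)   # one shard stored in the checkpoint
model_shape = (150528, 12288)       # full embedding built by the current model

mp_size = infer_model_parallel_size(checkpoint_shape, model_shape)
print("checkpoint looks like 1 of", mp_size, "shards")
print("expected per-rank shape:", shard_shape(model_shape, mp_size))
```

If the inferred shard count matches the model-parallel size reported in the log, the mismatch comes from building an unsharded model against a sharded checkpoint, not from a corrupted file.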