CLUEPretrainedModels issues

Results 15 CLUEPretrainedModels issues

Question about using clue/roberta_chinese_pair_tiny in transformers

I am using roberta_chinese_pair_tiny through transformers and get the following warning: 1. You are using a model of type roberta to instantiate a model of type bert. This is not supported for all configurations of models and can yield errors....

zy614582280
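
The warning quoted above appears when the `model_type` declared in a checkpoint's config differs from the model class used to load it. The snippet below is a minimal sketch of that condition, not the library's actual code; `model_type_mismatch` and the example config string are hypothetical and used only for illustration.

```python
import json


def model_type_mismatch(config_json: str, class_model_type: str) -> bool:
    """Return True when the config declares a model_type other than the loading class's."""
    config = json.loads(config_json)
    declared = config.get("model_type")
    # A config without a model_type field gives nothing to compare against.
    return declared is not None and declared != class_model_type


# Hypothetical config fragment: the checkpoint declares "roberta".
example_config = '{"model_type": "roberta"}'

print(model_type_mismatch(example_config, "bert"))     # loading with a BERT class
print(model_type_mismatch(example_config, "roberta"))  # loading with a RoBERTa class
```

The check only compares the two type names. Whether the weights load correctly with a given class is a separate question that it does not answer.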

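Question about continued pretraining

Hello, I am trying to continue pretraining the Roberta-tiny-clue model on my own corpus, but loading the model fails with ``` tensorflow.python.framework.errors_impl.NotFoundError: Key cls/predictions/transform/dense/kernel not found in checkpoint ``` It looks like the parameters of the MLM layer were not kept in the checkpoint, so I would like to confirm: would it be possible to release the MLM layer parameters as well? Many thanks.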

Jhangsy
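
One way to confirm the suspicion in the report above is to look for variables under the `cls/predictions` scope among the names stored in the checkpoint. The snippet below is a minimal sketch: `missing_mlm_variables` is a hypothetical helper, and the example name list is invented for illustration, not read from the released model. With TensorFlow installed, the real names could be obtained from `tf.train.list_variables`, which returns `(name, shape)` pairs.

```python
def missing_mlm_variables(checkpoint_names, required_prefix="cls/predictions"):
    """Return True when no variable under required_prefix is present."""
    return not any(name.startswith(required_prefix) for name in checkpoint_names)


# Hypothetical variable names, standing in for a checkpoint that kept only the encoder.
example_names = [
    "bert/embeddings/word_embeddings",
    "bert/encoder/layer_0/attention/self/query/kernel",
]

print(missing_mlm_variables(example_names))
# Adding the key from the error message makes the check pass.
print(missing_mlm_variables(example_names + ["cls/predictions/transform/dense/kernel"]))
```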

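Hello authors, the paper proposes a new attention mechanism. Could you share the code for it? The description is too abstract for me to understand.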

zhaolulul

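max_Seq_length used during pretraining

Was the sequence length of your pretraining data 256? If I want to continue pretraining, does my max_Seq_length have to match your model's, or can I set it to 512?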

gsxf997
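
On the question above: in BERT-style models the learned position-embedding table sets an upper bound on sequence length, so a continued-pretraining length has to fit under the `max_position_embeddings` value in the model config. The snippet below is a minimal sketch of that check. `check_seq_length` is a hypothetical helper, and the value 512 in the example config is an assumption for illustration, not a confirmed setting of the CLUE models.

```python
import json


def check_seq_length(config_json: str, max_seq_length: int) -> bool:
    """Return True if max_seq_length fits inside the position-embedding table."""
    config = json.loads(config_json)
    limit = config["max_position_embeddings"]
    return max_seq_length <= limit


# Assumed config fragment; read the real value from the released config file.
example_config = '{"max_position_embeddings": 512}'

print(check_seq_length(example_config, 256))
print(check_seq_length(example_config, 512))
print(check_seq_length(example_config, 1024))
```

Even when a longer length fits the table, positions beyond the length used in the original pretraining may have received little or no training, so their embeddings would be learned mostly during the continued run.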

update converter

xu-song

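Can MLM inference work correctly?

Hi, thanks to the CLUE team for your contributions to the Chinese NLP community. I downloaded the pretrained models you provide and found that the Masked LM predictions look fairly random. My guess is that some parameters of the MLM model were not saved. Would it be possible to release those parameters as well?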

YuxianMeng

Did the roberta models released by CLUE use wwm during pretraining?

For comparison, the pretrained roberta released by brightmart explicitly states that it uses wwm.

waywaywayw

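Training parameters for roberta_tiny_clue on IFLYTEK'

Hello, would it be possible to publish the training parameters used for roberta_tiny_clue on the IFLYTEK' dataset? I tried several parameter settings but only got results close to albert_tiny. With learning_rate=2e-5, batch=32, length=128, epoch=10, the accuracy is only 45.44%. Thank you!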

selephantjy