uRENu

9 comments by uRENu

I have the same problem. To solve it, I customized a mapping table to ensure that the indices of the training set and the test...

> Can you try using a later TensorRT version? TRT 5.1 is not being actively supported anymore.

I have upgraded TensorRT to 7.0.0. Now the following error occurs in step...

Hello, I see that requirements-post-training.txt requires Keras==2.3.0 and tensorflow-gpu==1.14.0, but these two versions are incompatible. I therefore used Keras==2.3.1 and tensorflow-gpu==1.15.0, but running the pre-training always produces the following error: InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights' with dtype float and shape [?] [[{{node mlm_loss_sample_weights}}]] [[loss/Identity/_2901]] (1) Invalid argument:...

To add to the above: the problem occurs when running pair-post-training-wwm-sop.py.

After rechecking, I found that the cause was the format of the input data. But now a new problem has appeared at `train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss, mlm_acc])`: ValueError: Output tensors to a Model must be the output of a Keras `Layer` (thus holding past layer metadata). Found:

In the end I solved it by changing the output of the data generator:

    class data_generator(DataGenerator):
        """Data generator"""
        def __iter__(self, shuffle=False):
            batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
            y = []
            for is_end, (text1, text2, _) in self.get_sample(shuffle):
                token_ids, segment_ids, output_ids = sample_convert(text1, text2)
                is_masked =...

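> > Which script are you running? What is the exact error message? With so little context it is really hard for me to guess.
>
> Sorry, I wrote my own pretraining code with data processing, based on your pretraining.py. Because data_generator had no preset values to output for mlm_loss and mlm_acc, I output None directly in that part, and so I ran into this error:
> InvalidArgumentError: 2 root error(s) found. (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?]...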
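The "output None" situation described above can be illustrated with a minimal, framework-free sketch. It assumes that the loss and accuracy are computed inside the model, so the generator only needs to supply placeholder targets, one per model output. The function name `batches_with_dummy_targets` and its parameters are hypothetical and do not come from the scripts discussed here. In a real Keras pipeline the lists would typically be converted to arrays before being yielded.

```python
def batches_with_dummy_targets(samples, batch_size, num_outputs=2):
    """Yield (inputs, targets) pairs where targets are zero placeholders.

    samples: list of token-id lists (hypothetical, assumed already padded).
    num_outputs: number of model outputs, e.g. 2 for [mlm_loss, mlm_acc].
    """
    for start in range(0, len(samples), batch_size):
        batch = samples[start:start + batch_size]
        # One placeholder vector per model output, one zero per sample.
        # They are not real labels; they only give every output a target
        # instead of None.
        targets = [[0.0] * len(batch) for _ in range(num_outputs)]
        yield batch, targets


# Example: three samples, batch size two, two model outputs.
for inputs, targets in batches_with_dummy_targets([[1, 2], [3, 4], [5, 6]], 2):
    print(len(inputs), targets)
```

Whether zero placeholders are the right choice depends on how the model's losses are wired; this only shows the shape of the idea.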

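Another problem I ran into: when using toolkit4nlp.optimizers, warmup does not work as expected. The learning rate is not increased according to the values I set: ![image](https://user-images.githubusercontent.com/54517380/113263799-d0f80700-9304-11eb-8502-f564c199fd80.png) I set lr_schedule={int(len(train_generator) * epochs * 0.1): 1.0, len(train_generator) * epochs: 0.1}, with epochs set to 200. As I understand it, the learning rate should ramp up to the configured value (I set 5e-5) over the first 20 epochs, and training should then continue at 0.1 times that value. But when I checked the learning rate during training, I found that it was training at 5e-5 from the very start. ![image](https://user-images.githubusercontent.com/54517380/113264475-99d62580-9305-11eb-875b-d9299c30ad9e.png)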
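To make the expected behaviour concrete, here is a small stand-alone sketch of how a schedule dictionary of this form could be read as a piecewise-linear multiplier on the base learning rate. It is an assumption that toolkit4nlp interprets `lr_schedule` this way; the helper `piecewise_linear_multiplier` and the value `steps_per_epoch = 100` are hypothetical and used only for illustration.

```python
def piecewise_linear_multiplier(step, schedule, start=0.0):
    """Multiplier at `step`, linearly interpolated between schedule points.

    schedule maps a step count to a multiplier, e.g. {2000: 1.0, 20000: 0.1}.
    The multiplier starts at `start` for step 0 and stays at the last
    value once the final schedule point has been passed.
    """
    points = [(0, start)] + sorted(schedule.items())
    for (s0, v0), (s1, v1) in zip(points, points[1:]):
        if s1 == s0:
            continue  # skip zero-length segments (e.g. a key of 0)
        if step <= s1:
            return v0 + (v1 - v0) * (step - s0) / (s1 - s0)
    return points[-1][1]


# Hypothetical numbers mirroring the setup in the comment above.
steps_per_epoch = 100   # stands in for len(train_generator); assumed value
epochs = 200
base_lr = 5e-5
total = steps_per_epoch * epochs
schedule = {int(total * 0.1): 1.0, total: 0.1}

for step in (0, 1000, 2000, 11000, 20000):
    print(step, base_lr * piecewise_linear_multiplier(step, schedule))
```

Under this reading, the rate at step 0 would be 0 rather than 5e-5, and after the warmup it would decay linearly toward 0.1 times the base value instead of dropping to it immediately.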