InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?]
     [[{{node mlm_loss_sample_weights_3}}]]
     [[Mean_3/_7863]]
  (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?]
     [[{{node mlm_loss_sample_weights_3}}]]
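Hello, my environment is: tf-gpu 1.15, keras 2.3.1, toolkit4nlp 0.5.0. The error above is raised every time the loss is computed. Is this caused by the versions or by the loss computation itself?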
After checking again, I found that it was probably a problem with the format of the input data. But now a new problem has come up:

    train_model = Model(
        bert.model.inputs + [token_ids, is_masked], [mlm_loss, mlm_acc]
    )

    ValueError: Output tensors to a Model must be the output of a Keras Layer (thus holding past layer metadata). Found: <function mlm_loss at 0x7fa2f519e8c0>
In the end I solved it by changing the output of the data generator:

    class data_generator(DataGenerator):
        """Data generator."""

        def __iter__(self, shuffle=False):
            batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
            y = []
            for is_end, (text1, text2, _) in self.get_sample(shuffle):
                token_ids, segment_ids, output_ids = sample_convert(text1, text2)
                # 1 where output_ids holds a prediction target, 0 elsewhere
                is_masked = [0 if i == 0 else 1 for i in output_ids]
                # NOTE: these appends were missing from the posted snippet and are
                # restored as the apparent intent; what is appended to y was not shown.
                batch_token_ids.append(token_ids)
                batch_segment_ids.append(segment_ids)
                batch_output_ids.append(output_ids)
                batch_is_masked.append(is_masked)
                if is_end or len(batch_token_ids) == self.batch_size:
                    batch_token_ids = pad_sequences(batch_token_ids, maxlen=maxlen)
                    batch_segment_ids = pad_sequences(batch_segment_ids, maxlen=maxlen)
                    batch_output_ids = pad_sequences(batch_output_ids, maxlen=maxlen)
                    batch_is_masked = pad_sequences(batch_is_masked, maxlen=maxlen)
                    yield [batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked], np.array(y)
                    batch_token_ids, batch_segment_ids, batch_output_ids, batch_is_masked = [], [], [], []
As long as the np.array(y) slot is not None, no error is raised. My understanding is that any value can be passed there, because the mlm_loss output is always the loss computed inside the model:

    def mlm_loss(inputs):
        """Loss function; it has to be wrapped in a layer."""
        y_true, y_pred, mask = inputs
        y_true = K.cast(y_true, K.floatx())
        mask = K.cast(mask, K.floatx())
        loss = K.sparse_categorical_crossentropy(
            y_true, y_pred, from_logits=True
        )
        # average the per-token loss over masked positions only
        loss = K.sum(loss * mask) / (K.sum(mask) + K.epsilon())
        return loss

    mlm_loss = Lambda(mlm_loss, output_shape=(None, ), name='mlm_loss')([token_ids, proba, is_masked])
    train_model = Model(bert.model.inputs + [token_ids, is_masked], [mlm_loss])
    loss = {
        'mlm_loss': lambda y_true, y_pred: y_pred,  # return only y_pred; y_pred already is the mlm_loss
    }
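To make the masked averaging in mlm_loss concrete, here is a minimal pure-Python sketch of the same arithmetic for a single sequence. It is only an illustration: the function name masked_mean_cross_entropy and the toy inputs are made up for this example, and eps=1e-7 is assumed to match the default K.epsilon().

```python
import math


def masked_mean_cross_entropy(y_true, logits, mask, eps=1e-7):
    """Pure-Python analogue of the mlm_loss computation for one sequence.

    y_true: list of target token ids, one per position
    logits: list of per-position score lists over the vocabulary
    mask:   list of 0/1 flags, 1 where the position counts toward the loss
    """
    total = 0.0
    for target, scores, m in zip(y_true, logits, mask):
        # log-sum-exp, shifted by the max score for numerical stability
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        token_loss = log_z - scores[target]  # equals -log softmax(scores)[target]
        total += token_loss * m
    # divide by the number of counted positions, not the sequence length
    return total / (sum(mask) + eps)


# Position 1 has mask 0, so its (badly wrong) prediction does not affect the loss.
loss = masked_mean_cross_entropy(
    y_true=[0, 1],
    logits=[[2.0, 0.0], [5.0, 0.0]],
    mask=[1, 0],
)
print(loss)
```

Because the loss is already fully computed from the model's own inputs, the y_true argument of the compiled loss function is never used, which is why any placeholder target works.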
Sorry about that. I wrote my own pretraining script with data processing on top of your pretraining.py. My data_generator did not output placeholder target values for mlm_loss and mlm_acc and yielded None in that position instead, which is what caused this error:

    InvalidArgumentError: 2 root error(s) found.
      (0) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?]
         [[{{node mlm_loss_sample_weights_3}}]]
         [[Mean_3/_7863]]
      (1) Invalid argument: You must feed a value for placeholder tensor 'mlm_loss_sample_weights_3' with dtype float and shape [?]
         [[{{node mlm_loss_sample_weights_3}}]]
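The fix described above (yield a placeholder target instead of None) can be sketched without any framework. This is a simplified illustration, not the toolkit4nlp DataGenerator: the names pack_batch and batches_with_dummy_targets are hypothetical, padding and NumPy conversion are left out, and the sample tuples stand in for whatever sample_convert returns.

```python
def pack_batch(batch):
    """Turn a list of (token_ids, segment_ids, output_ids) into (inputs, targets)."""
    token_ids = [s[0] for s in batch]
    segment_ids = [s[1] for s in batch]
    output_ids = [s[2] for s in batch]
    is_masked = [[0 if i == 0 else 1 for i in ids] for ids in output_ids]
    # One placeholder target per sample. Its value is ignored by a loss of the
    # form `lambda y_true, y_pred: y_pred`, but it must not be None.
    dummy_targets = [0.0] * len(batch)
    return [token_ids, segment_ids, output_ids, is_masked], dummy_targets


def batches_with_dummy_targets(samples, batch_size):
    """Yield (inputs, targets) pairs in which targets is never None."""
    batch = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield pack_batch(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield pack_batch(batch)


# Hypothetical toy data: five identical three-token samples.
samples = [([101, 103, 102], [0, 0, 0], [0, 2769, 0])] * 5
for inputs, targets in batches_with_dummy_targets(samples, batch_size=2):
    print(len(inputs[0]), targets)
```

If the model has several loss outputs (for example both mlm_loss and mlm_acc), the same idea presumably applies with one placeholder array per output.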
Another problem I ran into is with toolkit4nlp.optimizers: warmup does not seem to work, and the learning rate does not increase according to the values I set:
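I set lr_schedule={int(len(train_generator) * epochs * 0.1): 1.0, len(train_generator) * epochs: 0.1} with epochs = 200. As I understand it, the learning rate should ramp up to the configured value (I set 5e-5) over the first 20 epochs, and after that training should use 0.1 times the configured value. But when I check the learning rate during training, I find that it trains at 5e-5 from the very beginning.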
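For reference, here is a small sketch of how a {step: multiplier} schedule like the one above is commonly interpreted. It assumes piecewise-linear semantics (implicit start at multiplier 0 for step 0, linear interpolation between breakpoints, constant after the last one); whether toolkit4nlp.optimizers behaves exactly this way is an assumption, and steps_per_epoch = 100 is a made-up value standing in for len(train_generator).

```python
def piecewise_linear_multiplier(step, schedule):
    """Learning-rate multiplier at `step` for a {step: multiplier} schedule.

    Assumed semantics: the schedule implicitly starts at (0, 0.0) unless it
    defines step 0 itself, is linear between breakpoints, and stays constant
    after the last breakpoint.
    """
    points = sorted(schedule.items())
    if points[0][0] != 0:
        points = [(0, 0.0)] + points
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if step <= x1:
            return y0 + (y1 - y0) * (step - x0) / (x1 - x0)
    return points[-1][1]


base_lr = 5e-5
epochs = 200
steps_per_epoch = 100  # hypothetical stand-in for len(train_generator)
total_steps = steps_per_epoch * epochs
schedule = {int(total_steps * 0.1): 1.0, total_steps: 0.1}

for step in (0, 1000, 2000, 11000, 20000):
    print(step, base_lr * piecewise_linear_multiplier(step, schedule))
```

Under these assumed semantics the rate climbs linearly to 5e-5 over the first 10% of steps and then decays linearly toward 0.1 x 5e-5 at the final step, rather than dropping to 0.1x right after warmup. One possible reason for seeing a constant 5e-5, not confirmed in this thread, is that the value being inspected is the optimizer's base learning rate rather than the scheduled one.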