bert4keras
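How to load multiple decoders with the T5 model
When asking a question, please provide as much of the following information as possible:
Basic information
- Your operating system:
- Your Python version: 3.6
- Your Tensorflow version: 1.15
- Your Keras version: 2.3.1
- Your bert4keras version: 0.11.3
- Are you using pure keras or tf.keras:
- The pretrained model you loaded: mt5.1.1
Core code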
class Multi_decoder(tf.keras.Model):
    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def call(self, inputs):
        encoder_input, decoder_input = inputs
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
        decoder_outputs = self.decoder([decoder_input, encoder_encodings, encoder_masks])
        return decoder_outputs
Output
Traceback (most recent call last):
  File "call.py", line 175, in <module>
    model.fit(x=[batch_t_token_ids, batch_p_token_ids], y=batch_p_token_ids, batch_size=batch_size, epochs=epochs, callbacks=[evaluator])
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    use_multiprocessing=use_multiprocessing)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 643, in fit
    shuffle=shuffle)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2418, in _standardize_user_data
    all_inputs, y_input, dict_inputs = self._build_model_with_inputs(x, y)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2621, in _build_model_with_inputs
    self._set_inputs(cast_inputs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2708, in _set_inputs
    outputs = self(inputs, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 854, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
NameError: in converted code:

    call.py:23 call *
        encoder_encodings, encoder_masks = self.encoder(encoder_input)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/base_layer.py:506 __call__ *
        output_shape = self.compute_output_shape(input_shape)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/network.py:656 compute_output_shape *
        output_shape = layer.compute_output_shape(
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:173 compute_output_shape *
        output_shape = self._compute_elemwise_op_output_shape(output_shape,
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/layers/merge.py:50 _compute_elemwise_op_output_shape *
        for i, j in zip(shape1[-len(shape2):], shape2):
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:339 for_stmt
        return _py_for_stmt(iter_, extra_test, body, get_state, set_state, init_vars)
    /Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:348 _py_for_stmt
        if extra_test is not None and not extra_test(*state):
    /var/folders/v_/m84qz0751dv95zzxzwll27840000gp/T/tmpnbr9tvxs.py:158 extra_test
        return ag__.not_(do_return_2)

    NameError: free variable 'do_return_2' referenced before assignment in enclosing scope
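What I have tried
Hello, I would like to try a T5-based model with multiple decoders, that is, split T5 into its encoder and decoder and duplicate the decoder several times. The idea is to load multiple decoders through build_transformer_model. For now I am still using a single decoder, and even that fails to run. Can this be achieved with code written this way?
Judging from the error message, this does not seem to be related to the model implementation?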
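One thing the traceback above does show is that frames come from two different Keras packages: `tensorflow_core/python/keras` (tf.keras) and the standalone `keras` package, while the wrapper class subclasses `tf.keras.Model`. Whether this mix is the actual cause of the `NameError` is an assumption, not something confirmed in this thread. Below is a minimal sketch for checking a traceback for such a mix; the helper name `keras_flavors` and the two regular expressions are hypothetical, written only for illustration.

```python
import re

# Hypothetical patterns: classify a traceback line by the Keras package its path points to.
_TF_KERAS = re.compile(r"site-packages/tensorflow(?:_core)?/python/keras/")
_PURE_KERAS = re.compile(r"site-packages/keras/")


def keras_flavors(traceback_text):
    """Return the set of Keras implementations whose files appear in the traceback."""
    flavors = set()
    for line in traceback_text.splitlines():
        if _TF_KERAS.search(line):
            flavors.add("tf.keras")
        elif _PURE_KERAS.search(line):
            flavors.add("keras")
    return flavors


# Two lines copied from the traceback in this issue.
sample = "\n".join([
    'File "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit',
    "/Users/zhangkaizhou/opt/anaconda3/envs/tf115/lib/python3.6/site-packages/keras/engine/base_layer.py:506 __call__ *",
])

print(sorted(keras_flavors(sample)))  # ['keras', 'tf.keras']
```

If both names are reported, the encoder and decoder were probably built with standalone Keras and then called inside a tf.keras model. Something to try in that case, offered as a suggestion rather than a confirmed fix: make both sides use the same Keras, for example by setting the `TF_KERAS=1` environment variable before importing bert4keras, or by building the wrapper on standalone Keras instead of `tf.keras.Model`.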