It seems that the moe feed-forward layer type has not been implemented...
Description
The moe feed-forward layer type does not seem to be implemented. In tensor2tensor/layers/common_attention.py:289 I can't find a moe entry in the layer dictionary; only fc, sep, and sepm feed-forward layers are registered:
```python
cur_layers = dict(
    # Attention layers:
    a=multihead_attention_fn,             # Multihead full attention
    loc=local_attention_fn,               # Local attention
    locm=local_attention_masked_fn,       # Local attention (masked)
    red=compressed_attention_fn,          # Memory-compressed attention
    redm=compressed_attention_masked_fn,  # Memory-compressed att (masked)
    mem=memeff_attention_fn,              # Memory efficient
    # Feed-forward layers:
    fc=conv_hidden_relu,                  # Fully connected
    sep=sep_conv_relu,                    # Separable convolution (unmasked)
    sepm=sep_conv_relu_masked,            # Separable convolution (masked)
)
```
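
For context, the missing moe entry would be a mixture-of-experts feed-forward layer: a small gating network scores a set of expert feed-forward networks, routes each position to its top-k experts, and mixes their outputs by the gate weights. The snippet below is only a NumPy sketch of that idea, not the tensor2tensor implementation; every name in it (moe_ffn, gate_w, expert_ws, expert_vs) is invented for illustration.

```python
# Illustration only (plain NumPy, not the tensor2tensor API): what a "moe"
# feed-forward entry would conceptually compute. All names are invented.
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def moe_ffn(x, gate_w, expert_ws, expert_vs, k=2):
    """Top-k gated mixture-of-experts feed-forward layer.

    x: [tokens, d_model], gate_w: [d_model, num_experts],
    expert_ws[i]: [d_model, d_ff], expert_vs[i]: [d_ff, d_model].
    """
    gates = softmax(x @ gate_w)                # [tokens, num_experts]
    topk = np.argsort(-gates, axis=-1)[:, :k]  # k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for e in topk[t]:
            h = np.maximum(x[t] @ expert_ws[e], 0.0)    # expert = ReLU FFN
            out[t] += gates[t, e] * (h @ expert_vs[e])  # gate-weighted mix
    return out


# Toy usage: 4 tokens, d_model=8, d_ff=16, 4 experts, top-2 routing.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
gate_w = rng.normal(size=(8, 4))
expert_ws = [rng.normal(size=(8, 16)) for _ in range(4)]
expert_vs = [rng.normal(size=(16, 8)) for _ in range(4)]
print(moe_ffn(x, gate_w, expert_ws, expert_vs).shape)  # (4, 8)
```

A real implementation typically also renormalizes the gate weights over the selected experts and adds a load-balancing loss; the sketch omits both.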
Environment information
OS: Linux
tensor2tensor: 1.14.1
Python: 3.6.5
Error logs
```
KeyError: "in converted code:
    relative to tensor2tensor:

    utils\t2t_model.py:326 call
        sharded_logits, losses = self.model_fn_sharded(sharded_features)
    utils\t2t_model.py:374 model_fn_sharded
        self._to_single_features_dict(transformed_features))
    models\research\transformer_moe.py:172 body_sharded
        x = prepostprocess(layers[ff_type])(

    KeyError: 'moe'
"
```
Steps to reproduce:
Set:

```python
FLAGS.model = "transformer_moe"
FLAGS.hparams_set = "transformer_moe_2k"
```

then start training.
Are there any updates on this?
@Roshanson what did you end up doing?