keras-transformer
Allow Self-Attention in TransformerBlock
Adds a use_self_attention parameter to the TransformerBlock constructor, which allows the block to be used in self-attention mode. This is useful, for example, for building decoders in machine translation tasks.
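For context, here is a plain-NumPy sketch of the distinction such a flag would toggle (this is not this library's API, just single-head attention without masking; all names here are illustrative): in self-attention the queries, keys, and values all come from one sequence, while in encoder-decoder (cross) attention the queries come from the decoder and the keys/values from the encoder.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head scaled dot-product attention, no masking."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # (len_q, len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                              # (len_q, d_v)

rng = np.random.default_rng(0)
decoder_seq = rng.normal(size=(5, 8))  # 5 target tokens, model dim 8
encoder_seq = rng.normal(size=(7, 8))  # 7 source tokens, model dim 8

# Self-attention: q, k, v all come from the same sequence.
self_out = scaled_dot_product_attention(decoder_seq, decoder_seq, decoder_seq)

# Cross-attention: queries from the decoder, keys/values from the encoder.
cross_out = scaled_dot_product_attention(decoder_seq, encoder_seq, encoder_seq)

print(self_out.shape)   # (5, 8)
print(cross_out.shape)  # (5, 8)
```

Either way the output has one row per query token, so a decoder block can stack both kinds of attention over the same target sequence.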
Feel free to revert the changes I made to support Python 3.
Hi!
Could you please post an example that utilizes these changes? Perhaps a function that builds a model, similar to vanilla_transformer_gpt_model.
Absolutely! I have a sample NMT model that works pretty well with your library. I'll need a day or two to clean it up before submitting it, though.
Done - I've tested run_nmt on TensorFlow. Not sure which backend you're using, but I don't expect any problems with the others, since I never use tf directly.