keras-transformer
Allow Self-Attention in TransformerBlock
Adds a use_self_attention parameter to the TransformerBlock constructor, which allows the block to be used in self-attention mode. This is useful, for example, for building decoders in machine translation tasks.
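For context, here is a plain-NumPy sketch of the distinction such a flag would toggle (this is not this library's API, just single-head attention without masking; all names here are illustrative): in self-attention the queries, keys, and values all come from one sequence, while in encoder-decoder (cross) attention the queries come from the decoder and the keys/values from the encoder.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head scaled dot-product attention, no masking."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # (len_q, len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                              # (len_q, d_v)

rng = np.random.default_rng(0)
decoder_seq = rng.normal(size=(5, 8))  # 5 target tokens, model dim 8
encoder_seq = rng.normal(size=(7, 8))  # 7 source tokens, model dim 8

# Self-attention: q, k, v all come from the same sequence.
self_out = scaled_dot_product_attention(decoder_seq, decoder_seq, decoder_seq)

# Cross-attention: queries from the decoder, keys/values from the encoder.
cross_out = scaled_dot_product_attention(decoder_seq, encoder_seq, encoder_seq)

print(self_out.shape)   # (5, 8)
print(cross_out.shape)  # (5, 8)
```

Either way the output has one row per query token, so a decoder block can stack both kinds of attention over the same target sequence.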
Feel free to revert the changes I made to support Python 3.
Hi!
Could you please post an example that utilizes these changes? Perhaps a function that builds a model, similar to vanilla_transformer_gpt_model.
Absolutely! I have a sample NMT model that works pretty well with your library. I'll need a day or two to clean it up before submitting it, though.
Done - I've tested run_nmt on TensorFlow. Not sure which backend you're using, but I don't expect any problems with the others, since I never use tf directly.