
Allow Self-Attention in TransformerBlock

Open neonbjb opened this issue 6 years ago • 3 comments

Adds a use_self_attention parameter to the TransformerBlock constructor which allows the use of this block in self attention mode. This is useful for creating decoders in machine translation tasks, for example.
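To illustrate the distinction the new parameter toggles, here is a minimal NumPy sketch of scaled dot-product attention used in both modes. The function and variable names are illustrative, not the library's actual API: in self-attention mode the queries, keys, and values all come from the same sequence, whereas a translation decoder's cross-attention draws keys and values from the encoder output.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q: (seq_q, d); k, v: (seq_kv, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))       # decoder-side input sequence
memory = rng.standard_normal((7, 8))  # encoder output ("memory")

# Self-attention: Q, K, V all derived from the same sequence.
self_out = scaled_dot_product_attention(x, x, x)

# Cross-attention: queries from the decoder, keys/values from the encoder.
cross_out = scaled_dot_product_attention(x, memory, memory)
```

Both calls return one output vector per query position, so `self_out` and `cross_out` are each of shape `(5, 8)` here; only the source of the keys and values differs.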

Feel free to revert the changes I made to support Python 3.

neonbjb avatar Feb 26 '19 05:02 neonbjb

Hi! Could you please post an example that utilizes these changes? Perhaps a function that builds a model, similar to vanilla_transformer_gpt_model.

kpot avatar Feb 26 '19 13:02 kpot

Absolutely! I have a sample NMT model that works pretty well with your library. I'll need a day or two to clean it up before submitting it, though.

neonbjb avatar Feb 26 '19 14:02 neonbjb

Done - I've tested run_nmt on the TensorFlow backend. Not sure which backend you're using, but I don't expect a problem with the others since I never use tf directly.

neonbjb avatar Feb 27 '19 03:02 neonbjb