keras-transformer
Keras library for building (Universal) Transformers, facilitating BERT and GPT models
Currently I, and I suspect many others, are installing this via `master`. Could a release be tagged and pushed up for better reproducibility?
Hey, nice code! Can you reproduce the results of the original implementation? If I understand correctly, you only implemented the encoder side? Best, Luca
There was a typo in the diagram showing the arrangement of the layers in the Universal Transformer paper.
I saw that `TransformerBlock` was designed with two modes: vanilla and non-vanilla wiring. As documented, the vanilla wiring is used for the plain Transformer and the non-vanilla...
`AddCoordinateEncoding` is loaded instead of `AddPositionalEncoding`. Apply the following patch to position.py:

```
@@ -135,5 +135,5 @@
 get_custom_objects().update({
     'TransformerCoordinateEmbedding': TransformerCoordinateEmbedding,
     'AddCoordinateEncoding': AddCoordinateEncoding,
-    'AddPositionalEncoding': AddCoordinateEncoding,
+    'AddPositionalEncoding': AddPositionalEncoding,
 })
```
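Until the patch lands, a possible workaround is to pass the correct class explicitly when loading a saved model. This is only a sketch: the `keras_transformer.position` import path and the model filename are assumptions based on the `position.py` file mentioned above.

```python
# Hypothetical workaround sketch (module path assumed from position.py above):
# override the broken registry entry so the saved model gets the right layer.
from keras.models import load_model
from keras_transformer.position import AddPositionalEncoding

model = load_model(
    'my_transformer.h5',
    custom_objects={'AddPositionalEncoding': AddPositionalEncoding},
)
```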
Running `python -m example.run_gpt --save lm_model.h5` fails with:

```
TypeError: __new__() missing 2 required positional arguments: 'delimiter' and 'number'
```

Environment: Python 3.6.0, tensorflow-gpu 1.12, Keras 2.2.4
Great work on this code! One typical feature of Transformer models is to use a mask to handle variable-length input sequences, such as in https://github.com/Lsdefine/attention-is-all-you-need-keras/blob/042ce3846b80dcebb169c856f378bfe26a18c6e4/transformer.py#L89 Is there any...
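For reference, the usual padding-mask trick looks roughly like the sketch below. This is not part of the library's current API, just an illustration of the feature being requested; the tensor shapes and `pad_id` convention are assumptions.

```python
# Sketch of a padding mask (not in this library): pad positions receive a large
# negative bias added to the attention logits before the softmax.
from keras import backend as K

def padding_mask_bias(token_ids, pad_id=0):
    # (batch, seq_len) int tensor -> (batch, 1, 1, seq_len) additive bias
    mask = K.cast(K.equal(token_ids, pad_id), 'float32')
    return K.expand_dims(K.expand_dims(mask, 1), 1) * -1e9

# attention_logits: (batch, heads, seq_len, seq_len)
# masked_logits = attention_logits + padding_mask_bias(token_ids)
# weights = K.softmax(masked_logits)  # pad positions get ~zero attention weight
```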
Adds a `use_self_attention` parameter to the `TransformerBlock` constructor, which allows this block to be used in self-attention mode. This is useful for creating decoders in machine translation tasks, for...
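A rough usage sketch of what this might look like; the constructor arguments below, including the new `use_self_attention` flag, are assumptions about the proposed API rather than the released signature.

```python
# Hypothetical usage of the parameter proposed in this PR; argument names are
# assumptions, not the released TransformerBlock signature.
from keras_transformer.transformer import TransformerBlock

decoder_block = TransformerBlock(
    name='decoder_block',
    num_heads=8,
    residual_dropout=0.1,
    attention_dropout=0.1,
    use_masking=True,          # causal masking for autoregressive decoding
    use_self_attention=True,   # flag added by this PR (assumed behaviour)
)
decoder_output = decoder_block(target_embeddings)  # (batch, seq_len, model_dim)
```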
It throws the following error when I set `compression_window_size` to an integer:

```
Failed to convert object of type to Tensor. Contents: (-1, 16, None). Consider casting elements to a...
```
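This looks like the usual symptom of passing a Python tuple containing `None` to a reshape. A general sketch of the fix pattern is below; it is illustrative only and not a patch against this library's actual attention code, whose internals I have not checked.

```python
# General fix pattern for "(-1, 16, None)" reshape errors (illustrative only):
# a static shape tuple may not contain None, so infer one dimension with -1 and
# read the remaining dynamic dimension from K.shape(x) instead.
from keras import backend as K

def split_into_windows(x, window_size):
    # Wrong: K.reshape(x, (-1, window_size, None))  -> cannot convert to Tensor
    # Right: take the last dimension from the tensor's dynamic shape.
    return K.reshape(x, (-1, window_size, K.shape(x)[-1]))
```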