Atze00

6 comments by Atze00

Unfortunately I have no plans to support TorchScript at the moment. I'm also not convinced that's the only problem; for example, the computation of same padding could raise an error...
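
To illustrate the kind of issue meant here (a hypothetical sketch, not this repository's actual implementation): TF-style "same" padding is computed from the runtime input size, and that dynamic computation is exactly what tends to cause trouble under `torch.jit.trace` (the computed values get baked in for the traced shape) or `torch.jit.script` (every construct must be scriptable).

```python
import torch
import torch.nn.functional as F

def same_pad(size: int, kernel: int, stride: int) -> int:
    # total "same" padding needed along one dimension (uses ceil division)
    return max((-(size // -stride) - 1) * stride + kernel - size, 0)

class SameConv2d(torch.nn.Module):
    # hypothetical module, only to illustrate the dynamic-padding pattern
    def __init__(self, in_ch: int, out_ch: int, kernel: int, stride: int):
        super().__init__()
        self.kernel, self.stride = kernel, stride
        self.conv = torch.nn.Conv2d(in_ch, out_ch, kernel, stride)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # padding depends on x's runtime height/width: torch.jit.trace
        # records these Python ints as constants for the traced shape
        ph = same_pad(x.shape[-2], self.kernel, self.stride)
        pw = same_pad(x.shape[-1], self.kernel, self.stride)
        x = F.pad(x, (pw // 2, pw - pw // 2, ph // 2, ph - ph // 2))
        return self.conv(x)

out = SameConv2d(3, 8, kernel=3, stride=2)(torch.randn(1, 3, 7, 7))
print(out.shape)  # torch.Size([1, 8, 4, 4]) -> ceil(7 / 2) = 4
```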

Can you provide more information? A training log, for example. It's probably because it converges from the first epoch

The notebook functions correctly, and I use the networks daily. It's unlikely these problems are related to the code in this repository. If you provide a short Colab...

Hi! The models should follow the same blocks as the other models, so it shouldn't be necessary to implement new blocks. I suggest checking whether the A3, A4 and...

Thanks for the reply; that's both interesting and counter-intuitive. In my case it would cause some unwanted behavior. In particular, using the Transformer-XL with recurrence, the outputs would change...
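
To make the concern concrete, here is a minimal sketch of that recurrence pattern, assuming x-transformers' Transformer-XL memory interface (`max_mem_len`, `mems=`, `return_mems=True`); the exact keyword names may differ between versions:

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# assumed configuration: Transformer-XL style memories + relative position bias
model_xl = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 512,
    max_mem_len = 2048,        # hidden states cached across segments
    attn_layers = Decoder(
        dim = 512,
        depth = 6,
        heads = 8,
        rel_pos_bias = True
    )
)

segments = torch.randint(0, 20000, (1, 3, 512)).unbind(dim = 1)

mems = None
for seg in segments:
    # each segment attends to the memories of the previous segments, so the
    # positional-encoding behavior directly changes these outputs
    logits, mems = model_xl(seg, mems = mems, return_mems = True)
```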

Isn't the flag already present? `use_pos_emb` should work fine. This definition of the Transformer-XL should also work with the ALiBi positional encoding:

```python
model_xl = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len...
```
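
For completeness, a full configuration along those lines, based on the ALiBi example in the x-transformers README (in recent versions the absolute-positional-embedding flag is spelled `use_abs_pos_emb`; keyword names may differ from the snippet above):

```python
from x_transformers import TransformerWrapper, Decoder

model_xl = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 512,
    max_mem_len = 2048,            # keep the Transformer-XL memories
    use_abs_pos_emb = False,       # disable absolute positional embeddings
    attn_layers = Decoder(
        dim = 512,
        depth = 6,
        heads = 8,
        alibi_pos_bias = True      # use the ALiBi bias instead
    )
)
```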