PyTorch: pretrained models
Hi,
thanks for releasing the TensorFlow and PyTorch code for your Transformer-XL :heart:
I would like to ask whether you plan to provide pre-trained models for the PyTorch implementation? I was only able to find the TensorFlow checkpoints...
Thanks in advance,
Stefan
Thanks for pointing this out. We obtained the SoTA results using TF, so at this point we do not have a native PyTorch SoTA model available. Presumably there are tools for converting TF checkpoints into PyTorch, and PRs on this issue are welcome.
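For anyone who wants to try, here is a minimal sketch of what such a conversion could look like. The variable-name mapping is purely illustrative; the actual correspondence between the TF and PyTorch parameter names would still need to be worked out from the two implementations.

```python
# Minimal sketch of a TF-checkpoint -> PyTorch state_dict conversion.
# The name_map passed in is an assumption: {tf_name: (pytorch_name, transpose)}.
import numpy as np
import tensorflow as tf
import torch

def load_tf_weights(ckpt_path):
    """Read every variable in a TF checkpoint into numpy arrays."""
    reader = tf.train.load_checkpoint(ckpt_path)
    return {name: reader.get_tensor(name)
            for name in reader.get_variable_to_shape_map()}

def to_pytorch_state_dict(tf_weights, name_map):
    """Build a PyTorch state_dict from TF weights using a name mapping."""
    state_dict = {}
    for tf_name, (pt_name, transpose) in name_map.items():
        array = tf_weights[tf_name]
        if transpose:
            # TF Dense kernels are (in, out); torch.nn.Linear weights are (out, in).
            array = np.ascontiguousarray(array.T)
        state_dict[pt_name] = torch.from_numpy(array)
    return state_dict
```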
Hey, I'm interested in using the PyTorch version of your repo. I was wondering whether you have trained a model in PyTorch, and how it compared to the TensorFlow version? That is, has the PyTorch code been tested, and does it come close to SOTA?
Thanks
For small models (e.g. Transformer-XL base), we have compared the performance of the PyTorch and TF implementations, and they are very close on all the settings we tested. In fact, the results for base/small models in our paper were obtained with the PyTorch version. We don't have enough GPUs to run the PyTorch code with our largest setting, but it should be able to produce the same results as the TF version.
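If it helps, here is a rough sketch of loading a checkpoint produced by a PyTorch training run for evaluation. It assumes the run saved the whole model object via `torch.save` as `model.pt`; the path below is a placeholder.

```python
# Rough sketch: load a checkpoint from a PyTorch training run for evaluation.
# Assumes the whole model was written with torch.save as model.pt;
# 'work_dir/model.pt' is a placeholder for wherever your run stored it.
import torch

checkpoint_path = 'work_dir/model.pt'  # placeholder path
model = torch.load(checkpoint_path, map_location='cpu')
model.eval()

# From here the model can be run over the same test data as the TF version
# (e.g. computing perplexity on WikiText-103) to compare the two implementations.
```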
Excuse me! Have you trained the transformer_xl model on a Chinese corpus?
+1, I'm looking forward to you sharing a Chinese pre-trained model!