PyTorch: pretrained models

Open stefan-it opened this issue 6 years ago • 5 comments

Hi,

thanks for releasing the TensorFlow and PyTorch code for your Transformer-XL :heart:

I would like to ask whether you plan to provide pre-trained models for the PyTorch implementation. I was only able to find the TensorFlow checkpoints...

Thanks in advance,

Stefan

stefan-it avatar Jan 10 '19 09:01 stefan-it

Thanks for pointing this out. We obtained the SoTA results using TF, so at this point we do not have a native PyTorch SoTA model available. Presumably there are tools for converting TF models into PyTorch, and PRs on this issue are welcome.

kimiyoung avatar Jan 11 '19 06:01 kimiyoung

Hey, I'm interested in using the PyTorch version of your repo. I was wondering whether you have trained a model in PyTorch and, if so, how it compared to the TensorFlow version. That is, has the PyTorch code been tested, and is it close to SOTA?

Thanks

arvieFrydenlund avatar Jan 17 '19 22:01 arvieFrydenlund

For small models (e.g. Transformer-XL base), we have compared the performance of the PyTorch and TF implementations, and they are very close on all the settings we tested. In fact, the results for base/small models in our paper were obtained with the PyTorch version. We don't have enough GPUs to run PyTorch with our largest setting, but it should be able to produce the same results as the TF version.

kimiyoung avatar Jan 18 '19 00:01 kimiyoung

Excuse me! Have you trained a transformer_xl model on a Chinese corpus?

LindgeW avatar Dec 11 '19 13:12 LindgeW

> Excuse me! Have you trained a transformer_xl model on a Chinese corpus?

+1, I look forward to your sharing the Chinese pre-trained model!

iamxiaoyubei avatar Aug 07 '20 03:08 iamxiaoyubei