[QST] Training time for different models and datasets

Open · SalvadorMartiRoman opened this issue 2 years ago • 1 comment

❓ Questions & Help

I'm looking for people's experience in training different LM architectures from scratch on different datasets.

Details

Which dataset and language model did you use? How long did training take with your setup? If the authors of the paper are reading this, I'd like to know their training times for the replicable experiments.

Jul 28 '22 10:07 SalvadorMartiRoman

Hello @SalvadorMartiRoman. These numbers might help you:

https://github.com/NVIDIA-Merlin/publications/tree/main/2021_acm_recsys_transformers4rec/Appendices

For details about the datasets and models, you can check out our paper:

https://github.com/NVIDIA-Merlin/publications/tree/main/2021_acm_recsys_transformers4rec

Hope that helps.

Aug 02 '22 14:08 rnyak
