Transformers4Rec
[QST] Training time for different models and datasets
❓ Questions & Help
I'm looking for people's experience in training different LM architectures from scratch on different datasets.
Details
What dataset and language model did you use? How long did training take with your setup? If the authors of the paper are reading this, I'd like to know what their training times were for the replicable experiments.
Hello @SalvadorMartiRoman. These numbers might help you:
https://github.com/NVIDIA-Merlin/publications/tree/main/2021_acm_recsys_transformers4rec/Appendices
For details about the datasets and models, you can check out our paper:
https://github.com/NVIDIA-Merlin/publications/tree/main/2021_acm_recsys_transformers4rec
Hope that helps.