generative-recommenders
What computing power configuration is required?
What computing power configuration is required for training the GRs? With the experimental setup in the article, i.e. 256 H100s, how long does the model need to be trained?