torchrec

I want to use a CPU-based distributed approach to train a small recommendation model. Is there a demo available that I can refer to?

Open Lenan22 opened this issue 6 months ago • 1 comments

Lenan22 avatar Jun 03 '25 12:06 Lenan22

Have you looked into https://github.com/pytorch/torchrec/tree/main/examples/golden_training ?

yupadhyay avatar Jun 10 '25 19:06 yupadhyay

Closing this issue since it has already been answered.

> Have you looked into https://github.com/pytorch/torchrec/tree/main/examples/golden_training ?

I would suggest providing more context on your use case, e.g., how small the model is and why you need a distributed approach for such a small model. A CPU machine can easily have more than 1 TB of memory, while a single GPU has less than 100 GB. I wouldn't consider a 1 TB+ model a small one.
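To make the "how small is the model" question concrete, here is a rough back-of-the-envelope sketch for estimating embedding table memory. It is plain Python and does not use any torchrec API. The table names and sizes are hypothetical placeholders, and the estimate assumes fp32 weights and counts only embedding parameters (no dense layers, activations, or framework overhead).

```python
# Rough memory estimate for embedding tables (illustrative only).
# All table names and sizes below are hypothetical; substitute your own.

def embedding_table_bytes(num_rows: int, dim: int, bytes_per_param: int = 4) -> int:
    """Bytes needed to store one embedding table (fp32 by default)."""
    return num_rows * dim * bytes_per_param


def total_model_bytes(tables: dict, bytes_per_param: int = 4,
                      optimizer_state_copies: int = 0) -> int:
    """Sum table sizes; optionally add per-parameter optimizer state.

    optimizer_state_copies=2 approximates an Adam-style optimizer, which
    keeps two extra values per parameter (an assumption, not a measurement).
    """
    weights = sum(
        embedding_table_bytes(rows, dim, bytes_per_param)
        for rows, dim in tables.values()
    )
    return weights * (1 + optimizer_state_copies)


# Hypothetical feature tables: name -> (number of rows, embedding dimension)
tables = {
    "user_id": (10_000_000, 64),
    "item_id": (2_000_000, 64),
    "category": (10_000, 16),
}

total = total_model_bytes(tables)
with_adam = total_model_bytes(tables, optimizer_state_copies=2)

print(f"weights only:       {total / 1e9:.2f} GB")
print(f"with Adam-style state: {with_adam / 1e9:.2f} GB")
```

If the resulting number fits comfortably in one machine's RAM, a single-process CPU run is likely the simpler starting point, and sharding across hosts only becomes necessary once it does not.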

TroyGarden avatar Jun 22 '25 03:06 TroyGarden