torchrec
base example
Base training loop examples
Run command:
torchx run -s local_cwd dist.ddp -j 1x8 --script train_dlrm.py
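For readers unfamiliar with the `-j` flag: in the torchx `dist.ddp` component it takes the form `{nnodes}x{nproc_per_node}`, so `1x8` requests one node running eight worker processes (typically one per GPU). The sketch below only illustrates how that string maps to a world size. `parse_job_spec` is a hypothetical helper written for this note, not part of torchx or torchrec.

```python
from typing import Tuple


def parse_job_spec(spec: str) -> Tuple[int, int]:
    """Split a '{nnodes}x{nproc_per_node}' string into two ints.

    Hypothetical helper for illustration; torchx does its own parsing.
    """
    nnodes_part, sep, nproc_part = spec.partition("x")
    if not sep:
        raise ValueError(f"expected '<nnodes>x<nproc_per_node>', got {spec!r}")
    return int(nnodes_part), int(nproc_part)


nnodes, nproc_per_node = parse_job_spec("1x8")
# Total number of trainer processes launched for the job.
world_size = nnodes * nproc_per_node
print(f"nodes={nnodes} procs_per_node={nproc_per_node} world_size={world_size}")
```

On a machine with fewer than eight GPUs, lowering the second number (for example `-j 1x2`) is the usual adjustment.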
Some TODO items:
- Add NE/QPS metrics checkpointing
- Show saving this model and then loading it in later for inference
@YLGH has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.