torchrec
PyTorch domain library for recommendation systems
Summary: This diff converts unnecessary GreedyPerfPartitioner class methods into functions: it removes GreedyPerfPartitioner and keeps only the functions that are needed. Subsequently, GreedyPerfPartitioner was changed to a partition function in parallelized...
Summary: This diff discards bad sharding_type-kernel combinations from the enumerator, which ensures that bad sharding plans are never considered: the "data-parallel" sharding type is never paired with "batched-fused", "batched-fused-uvm", or...
Summary: Add JaggedTensorMeta Differential Revision: D37840024
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM). New submodule commit: https://github.com/pytorch/FBGEMM/commit/bff21c73487c5dc501acddb4788d985e9487bd68 Test Plan: Ensure that CI jobs succeed on GitHub before landing.
Summary: see https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html Reviewed By: mrfox321 Differential Revision: D37792917
Summary: First generate the data: bash nvt_preproc.sh /data/criteo/ /data/criteo_1_day/ 8192 Then run the command: torchx run -s local_cwd dist.ddp -j 1x8 --script train_torchrec.py -- --num_embeddings_per_feature 45833188,36746,17245,7413,20243,3,7114,1441,62,29275261,1572176,345138,10,2209,11267,128,4,974,14,48937457,11316796,40094537,452104,12606,104,35 --over_arch_layer_sizes 1024,1024,512,256,1 --binary_path /data/criteo_1_day/criteo_preproc/train/...
Summary: The motivation is to open-source the quantized comms library and refactor torchrec's quantized comms support. This diff is rather large (it used to be a bunch of small diffs)....
Base training loop examples. Run command: `torchx run -s local_cwd dist.ddp -j 1x8 --script train_dlrm.py`. Some TODO items: 1. Add NE/QPS metrics checkpointing 2. Show saving this model and...
Summary: Rename quantized comms config Differential Revision: D37221312
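The enumerator summary above describes discarding bad sharding_type-kernel combinations before any plan is considered. A minimal sketch of that filtering idea in plain Python follows. It is an illustration under stated assumptions, not torchrec's enumerator: `BAD_COMBINATIONS`, `enumerate_valid`, and the placeholder values "table-wise" and "dense" are hypothetical, and the excluded set contains only the two pairs the summary names, since the original list is truncated.

```python
from itertools import product

# Only the pairs explicitly named in the summary. The original list is
# truncated, so this set is intentionally incomplete.
BAD_COMBINATIONS = {
    ("data-parallel", "batched-fused"),
    ("data-parallel", "batched-fused-uvm"),
}


def enumerate_valid(sharding_types, kernels):
    """Return every (sharding_type, kernel) pair not in BAD_COMBINATIONS."""
    return [
        (sharding_type, kernel)
        for sharding_type, kernel in product(sharding_types, kernels)
        if (sharding_type, kernel) not in BAD_COMBINATIONS
    ]


# Placeholder inputs: "table-wise" and "dense" are illustrative values only.
sharding_types = ["data-parallel", "table-wise"]
kernels = ["dense", "batched-fused", "batched-fused-uvm"]

valid = enumerate_valid(sharding_types, kernels)
for sharding_type, kernel in valid:
    print(sharding_type, kernel)
```

Filtering at enumeration time, rather than later in planning, means downstream steps never have to score or reject these pairs.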