
PyTorch domain library for recommendation systems

Results 455 torchrec issues

Summary: # Context * Currently the torchrec IR serializer can't handle the variable-batch KJT use case. * To support VBE KJT, the `stride_per_key_per_rank` field needs to be flattened as a variable...

CLA Signed
fb-exported
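A minimal, hypothetical sketch of the flattening idea the summary above describes. The function names are illustrative and not the actual torchrec IR serializer API; the point is that a nested per-key, per-rank stride table becomes one flat list plus offsets, which a serializer that only supports flat variable-length fields can round-trip.

```python
# Hypothetical sketch (not the torchrec implementation): flatten a nested
# stride_per_key_per_rank table into a flat list plus offsets, and invert it.

def flatten_stride_per_key_per_rank(stride_per_key_per_rank):
    flat, offsets = [], [0]
    for per_rank in stride_per_key_per_rank:
        flat.extend(per_rank)           # append this key's per-rank strides
        offsets.append(len(flat))       # record where the next key starts
    return flat, offsets

def unflatten_stride_per_key_per_rank(flat, offsets):
    return [flat[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]

# Round-trip check with 3 keys over variable numbers of ranks.
table = [[2, 3], [1], [4, 4, 4]]
flat, offsets = flatten_stride_per_key_per_rank(table)
assert unflatten_stride_per_key_per_rank(flat, offsets) == table
```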

Differential Revision: D73051959

CLA Signed
fb-exported

Summary: This diff is a follow-up of D73474285 and lets other dense optimizers take the `enable_global_grad_clip` optim config. When `enable_global_grad_clip=True` and FSDP1 is used, it calculates the global gradient...

CLA Signed
fb-exported
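As a conceptual illustration of what global gradient clipping means (plain Python, not the FSDP1 implementation, which would accumulate local norm contributions across ranks before scaling): all gradients are scaled by one shared factor so their combined L2 norm stays under a threshold.

```python
import math

def clip_grads_by_global_norm(grads, max_norm):
    # Global gradient clipping: compute one L2 norm over ALL gradients,
    # then scale every gradient by the same factor if the norm exceeds
    # max_norm. Contrast with per-parameter clipping, which scales each
    # gradient independently.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    scale = max_norm / (global_norm + 1e-6)  # epsilon guards against /0
    if scale < 1.0:
        grads = [[g * scale for g in grad] for grad in grads]
    return grads, global_norm

clipped, norm = clip_grads_by_global_norm([[3.0, 4.0], [0.0]], max_norm=1.0)
assert abs(norm - 5.0) < 1e-9  # sqrt(3^2 + 4^2)
```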

Summary: See D73051959 for context. Update the `_maybe_compute_stride_kjt` logic to calculate stride based on `inverse_indices` for VBE KJTs. Currently, stride of a VBE KJT with `stride_per_key_per_rank` is calculated as the...

CLA Signed
fb-exported
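An illustrative sketch of the two ways a VBE KJT stride could be derived. The function names are hypothetical and the exact formulas in `_maybe_compute_stride_kjt` may differ; this only shows why the two can disagree for variable-batch inputs.

```python
# Hypothetical sketch of two stride computations for a variable-batch KJT.

def stride_from_strides(stride_per_key_per_rank):
    # Stride derived from the stride table: the largest total batch size
    # across keys (sum over ranks for each key, then max over keys).
    return max(sum(per_rank) for per_rank in stride_per_key_per_rank)

def stride_from_inverse_indices(inverse_indices):
    # Stride derived from inverse_indices: the length of the per-key
    # mapping back to the deduplicated output batch.
    return len(inverse_indices[0])

# Two keys, two ranks each: key totals are 2+3=5 and 1+4=5.
assert stride_from_strides([[2, 3], [1, 4]]) == 5
# An inverse-indices mapping of length 4 implies an output batch of 4.
assert stride_from_inverse_indices([[0, 1, 0, 2], [0, 0, 1, 2]]) == 4
```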

RT: currently, the tfra community has integrated nvidia/hkv as embedding storage. Can torchrec use that library for embeddings?

Summary: Adding a simple unsharded module reference to sharded modules. This will be used in Dynamic Sharding by `DistributedModelParallel` to reshard an already-sharded_module. As DMP is created with only one-way...

CLA Signed
fb-exported
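A toy sketch of the idea in the summary above: a sharded module keeps a reference to its original unsharded module so a later resharding pass can rebuild shards under a new plan. The class and method names are invented for illustration and are not the `DistributedModelParallel` API.

```python
# Hypothetical sketch: retain an unsharded-module reference for resharding.

class ShardedModuleSketch:
    def __init__(self, unsharded_module, plan):
        # Keep a handle to the original module; Dynamic Sharding can
        # re-derive shards from it instead of reconstructing from shards.
        self._unsharded_module = unsharded_module
        self.plan = plan

    def reshard(self, new_plan):
        # Build a new sharded view from the retained unsharded reference.
        return ShardedModuleSketch(self._unsharded_module, new_plan)

base = object()  # stands in for the dense/unsharded module
sharded = ShardedModuleSketch(base, plan="row_wise")
resharded = sharded.reshard(new_plan="table_wise")
assert resharded._unsharded_module is base  # same underlying module
```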

I want to freeze an EmbeddingCollection mid-training, but setting the learning rate to 0.0 does not save CUDA memory or compute. Is there a method similar to the following...
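A generic PyTorch sketch of the usual freezing approach (shown on `nn.Embedding`, not torchrec's EmbeddingCollection): disabling `requires_grad` skips backward computation for those parameters. Note that to reclaim optimizer-state memory you would also need to construct the optimizer without the frozen parameters; a zero learning rate alone leaves both gradients and optimizer state allocated.

```python
import torch
import torch.nn as nn

# Plain-PyTorch freezing sketch (illustrative, not torchrec-specific).
emb = nn.Embedding(1000, 64)

# Disable gradients: autograd will not build a backward graph through
# these parameters, so no gradient buffers are produced for them.
for p in emb.parameters():
    p.requires_grad_(False)

out = emb(torch.tensor([1, 2, 3]))
assert out.requires_grad is False  # no grad tracking through frozen weights
```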

Summary: as titled. Differential Revision: D73450618

CLA Signed
fb-exported

We found that the throughput of EmbeddingBagCollection at 16p is only 1.2 times that at 8p. Are there any optimization measures?
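For context, the reported numbers imply roughly 60% scaling efficiency, which for sharded embeddings often (though not always) points at communication overhead such as all-to-all rather than compute. The arithmetic:

```python
# Back-of-envelope scaling efficiency from the numbers reported above:
# doubling from 8p to 16p yields only a 1.2x throughput gain.
speedup = 1.2
resource_ratio = 16 / 8            # 2x the parallelism
efficiency = speedup / resource_ratio
assert abs(efficiency - 0.6) < 1e-9  # ~60% scaling efficiency
```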