torchrec
PyTorch domain library for recommendation systems
Differential Revision: D60993179
We aim to share embeddings between sparse features and sequence sparse features. For sparse features, we use `EmbeddingBagCollection`, and for sequence sparse features, we use `EmbeddingCollection`. Could you advise on...
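The question above is about backing both a pooled lookup (`EmbeddingBagCollection`) and a per-id sequence lookup (`EmbeddingCollection`) with the same weights. As a conceptual illustration only, here is a minimal plain-Python sketch (not the TorchRec API; the table and function names are hypothetical) of two lookup styles reading one shared table:

```python
# Hypothetical toy table shared by both lookup styles: id -> 2-dim embedding.
TABLE = {
    0: [1.0, 0.0],
    1: [0.0, 1.0],
    2: [1.0, 1.0],
}

def pooled_lookup(ids):
    """EmbeddingBagCollection-style: sum the embeddings of all ids in the bag."""
    out = [0.0, 0.0]
    for i in ids:
        out = [a + b for a, b in zip(out, TABLE[i])]
    return out

def sequence_lookup(ids):
    """EmbeddingCollection-style: one embedding per id, order preserved."""
    return [TABLE[i] for i in ids]

# Both lookups read the same underlying weights, so any update to TABLE
# is immediately visible to both "modules" -- the essence of sharing.
print(pooled_lookup([0, 1, 2]))  # [2.0, 2.0]
print(sequence_lookup([2, 0]))   # [[1.0, 1.0], [1.0, 0.0]]
```

In TorchRec itself, achieving this would require the two collections to reference the same underlying parameter tensors, which is what the issue is asking about.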
Hello, it looks like `EmbeddingBagCollection` forces the data type to be float32 or float16 during initialization: https://github.com/pytorch/torchrec/blob/main/torchrec/modules/embedding_modules.py#L179 Is there any support for making the embeddings float8? Note, this is for training....
Summary: Fixes T192448049. The module call forms an unusual call stack for the nodes: https://www.internalfb.com/phabricator/paste/view/P1507230978. This is currently not supported by the unflattener and needs some extra design to make it...
In the forward pass with table-wise sharding, when is pooling executed? Is it after the all-to-all communication, and does it run locally on the trainer? Where can I see the exact code...
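For intuition on the question above: in table-wise sharding, the rank that owns a table receives the ids destined for it (via an all-to-all exchange), then performs the lookup and pooling locally before results are distributed back. A toy plain-Python sketch of that owner-side step (the table contents and function name are hypothetical, not TorchRec code):

```python
# Hypothetical table fully owned by one rank under table-wise sharding.
TABLE = {0: [1.0], 1: [2.0], 2: [3.0]}

def owner_pool(batched_ids):
    """Sum-pool per sample, executed locally on the rank owning the table.

    batched_ids: ids gathered from all ranks via the all-to-all exchange,
    grouped per output sample.
    """
    return [sum(TABLE[i][0] for i in ids) for ids in batched_ids]

# Ids for two samples, as received after the (simulated) all-to-all:
gathered = [[0, 1], [2]]
print(owner_pool(gathered))  # [3.0, 3.0]
```

This is only a sketch of the data flow; the actual kernel and communication code live in TorchRec's sharded embedding modules.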
Summary: When using torch.export, device specializations can occur: https://fb.workplace.com/groups/1075192433118967/posts/1474166129888260/?comment_id=1474191496552390&reply_comment_id=1474683806503159 such as here: https://fburl.com/code/94ta7omp. Currently, the best solution is to do another pass over the graph and modify the device accordingly when...
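The "extra pass" described above amounts to walking the exported graph and rewriting hard-coded device arguments. A hedged toy sketch of that idea in plain Python (the node format here is a hypothetical dict list, not torch.export's actual IR):

```python
def remap_devices(nodes, target="cpu"):
    """Return a copy of the node list with any 'device' kwarg set to target."""
    out = []
    for n in nodes:
        kwargs = dict(n.get("kwargs", {}))
        if "device" in kwargs:
            kwargs["device"] = target  # overwrite the specialized device
        out.append({**n, "kwargs": kwargs})
    return out

# Toy graph: one node specialized to cuda:0, one device-free node.
graph = [
    {"op": "empty", "kwargs": {"device": "cuda:0"}},
    {"op": "add", "kwargs": {}},
]
print(remap_devices(graph)[0]["kwargs"]["device"])  # cpu
```

A real implementation would operate on `torch.fx` nodes and handle device objects, not strings, but the traversal pattern is the same.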
I am trying to apply DATA_PARALLEL sharding to the small embedding tables, which works with EmbeddingBagCollection. However, with FusedEmbeddingBagCollection it doesn't work and raises an error...
Summary: To improve inference, we want to make creating a KJT as cheap as possible, which means the init method is nothing more than an attribute setter. All other fields...
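The pattern described above, where `__init__` only assigns attributes and derived fields are computed lazily, can be sketched as follows (a hypothetical plain-Python class, not TorchRec's actual `KeyedJaggedTensor`):

```python
class CheapJaggedTensor:
    """Toy jagged tensor whose construction does no work beyond attribute sets."""

    def __init__(self, values, lengths):
        # No validation, no derived computation: construction stays O(1)
        # in Python-level work, which matters on the hot inference path.
        self._values = values
        self._lengths = lengths
        self._offsets = None  # derived field, filled in lazily

    def offsets(self):
        """Compute and cache prefix-sum offsets on first access."""
        if self._offsets is None:
            acc = [0]
            for length in self._lengths:
                acc.append(acc[-1] + length)
            self._offsets = acc
        return self._offsets

jt = CheapJaggedTensor(values=[5, 7, 9], lengths=[2, 1])
print(jt.offsets())  # [0, 2, 3]
```

The trade-off is that the first access to a derived field pays the deferred cost, so callers that never need offsets never pay for them.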