Ke Wen comments

Results: 65 comments of Ke Wen

[DTensor] allow numel 1 tensor operand to be implicitly replicate DTensor

> For all the other cases, numel() < nranks case is handled in here: https://github.com/pytorch/pytorch/blob/main/torch/distributed/_tensor/placement_types.py#L59-L61

Not sure how the pointed assert handles the case -- it only checks `sharder.dim` against...

Further explanation for `batch_isend_irecv`

Thanks, we will improve the document.

`torch.distributed` hangs when using `torch.distributed.barrier` before any other communication primitives.

The reason for the hang is complicated and yes, it is related to the code you refer to (guessing the device). There are two ways to work around it: 1. Pass a...

[pipelining] Add tests for tracing frontend

Good idea. Will doc it at the `pipeline` API level (unflattener is private).

[pipelining] Add tests for tracing frontend

@pytorchbot merge

[fused_rmsnorm] Register as a custom operator for tracing

What is "IMA" short for?

[fused_rmsnorm] Register as a custom operator for tracing

Thanks @lessw2020. Do you think the IMA relates to the Triton kernel? Can you help fix it? PP needs this fix to land. Would appreciate your help.

[fused_rmsnorm] Register as a custom operator for tracing

Thanks @lessw2020 for the demonstration.

> some kind of bug between triton load masking and what is going awry when run as a custom op

Can you point me to...

pipeline_tutorial failing due to dead torchtext link

I'd vote for deprecating the tutorial, as nobody maintains the software or the tutorial now.

FSDP+PP bug where reshard_after_forward must be true

I believe in old FSDP, where the FSDP API is called on the whole model, `reshard_after_forward` can be automatically figured out (or at least there is a way to do so)....