Wei Wang

61 comments by Wei Wang

@bryantbiggs created companion NCCL update in https://github.com/pytorch/builder/pull/1780. Per https://github.com/pytorch/pytorch/pull/124014#issuecomment-2058144686, should this be modified to point to the 1780 branch for testing?

The binary test failure is complaining about the NCCL wheel, which is available on PyPI but not in the AWS S3 index. cc @atalman

cc @jackkosaian for new findings and/or fixes.

Synced with @thakkarV that we will likely roll out a patch to v3.5.0. p.s. we are evaluating a potential fix now. Please stay tuned.

> @nWEIdia Any update on the patch?

cc @thakkarV for the CUTLASS patch

This PR seems to have improved the cspdarknet CUDA 12.4 accuracy test. ![image](https://github.com/pytorch/pytorch/assets/143543872/d82d4f98-2edf-4ce9-b5cc-834bfd48c791) We should consider changing the status from fail_accuracy to pass in the dynamic_inductor_timm CSV file.

@JieRen98 could you please post your command and "nvidia-smi" output for us to reproduce the issue? I have tried internally on a Hopper GPU and the "python setup.py install" had...

> @pytorchbot label "topic: not user facing"

I would argue this is `user facing`, as potentially this code path could produce something the user might not be seeing previously....

cc @atalman @malfet please help review when you get a chance, thanks!

Linking: this is trying to fix https://github.com/pytorch/pytorch/issues/164708 @lakshayg Could you please update the PR description to mention that this fixes #164708?