Wei Wang

61 comments by Wei Wang

Closing as stale.

To clarify, there are different "docker" jobs:
1) https://github.com/pytorch/pytorch/actions/runs/20078596581 These are "[.github/workflows/build-manywheel-images.yml](https://github.com/pytorch/pytorch/blob/200f540502141f41d1de169e35cf3312dd09cba8/.github/workflows/build-manywheel-images.yml)" jobs, and indeed one of them was flaky: https://github.com/pytorch/pytorch/actions/runs/20078596581/job/57695727990
2) This PR also has these failures: https://github.com/pytorch/pytorch/actions/runs/20088119672/workflow, and...

FYI @tinglvv: the GCC version change in https://github.com/pytorch/pytorch/pull/167933/files has landed, so this PR needs a rebase. nit: please wait until it is clear that https://github.com/pytorch/pytorch/pull/167933/files will not be reverted, to avoid potential additional issues.

What is your usual setup? We build and test apex mostly in containers: `docker pull nvcr.io/nvidia/pytorch:25.06-py3`. Next time you can try building the apex wheel in the container (you can...

This usually means the flash attention binary was not built with sm_100 in TORCH_CUDA_ARCH_LIST. Could you try rebuilding flash attention with TORCH_CUDA_ARCH_LIST set to include "10.0"? e.g. `export TORCH_CUDA_ARCH_LIST="9.0...
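Before rebuilding, it can help to confirm that the value you are about to export actually lists the target capability. The following is a minimal sketch, not part of flash attention or PyTorch: the helper name `arch_list_includes` is hypothetical, and it assumes the usual TORCH_CUDA_ARCH_LIST format of entries separated by spaces or semicolons, each optionally followed by `+PTX`.

```python
import os


def arch_list_includes(arch_list: str, capability: str) -> bool:
    """Return True if `capability` (e.g. "10.0") is an entry of a
    TORCH_CUDA_ARCH_LIST-style string.

    Hypothetical helper for illustration only. Matching is exact, so an
    entry such as "10.0a" is not treated as "10.0".
    """
    # Entries may be separated by spaces or semicolons.
    entries = arch_list.replace(";", " ").split()
    # Drop an optional "+PTX" suffix before comparing.
    return any(entry.split("+")[0] == capability for entry in entries)


if __name__ == "__main__":
    current = os.environ.get("TORCH_CUDA_ARCH_LIST", "")
    if arch_list_includes(current, "10.0"):
        print("TORCH_CUDA_ARCH_LIST includes 10.0")
    else:
        print("TORCH_CUDA_ARCH_LIST does not include 10.0:", repr(current))
```

Run it in the same shell where the build will run, so it sees the same environment variable.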

The PyTorch 2.8 aarch64 wheel was a planning miss and likely won't be fixed. Can you try the 2.9 RC wheel? https://dev-discuss.pytorch.org/t/pytorch-2-9-rc1-produced-for-pytorch-audio-vision/3234

PyTorch 2.9 was released with aarch64 CUDA support; please try the cu130 binary, e.g. `pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu130`. Oh, that was pytorch... On flash-attention, I don't have updates.

Just in case you are not aware, and in case this is what you need: we have monthly containers that ship flash attention. Please give the container a try:...

Hi, we merged https://github.com/NVIDIA/apex/pull/1918 which should help with the issue. Is this PR still needed?

Some compression algorithms may only be available starting with a relatively recent CUDA version. Please bump the CUDA toolkit version and retry the build.
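To see which toolkit the build is picking up, one option is to read the release number from the output of `nvcc --version`. The sketch below is illustrative only: the helper names are hypothetical, the sample string just mimics the usual `release X.Y` line, and the minimum version is a placeholder, since the comment above does not say which release is required.

```python
import re


def parse_nvcc_release(version_output: str):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_is_at_least(version_output: str, minimum) -> bool:
    """Return True if the parsed release is >= `minimum`, a (major, minor) tuple."""
    release = parse_nvcc_release(version_output)
    return release is not None and release >= minimum


if __name__ == "__main__":
    # Sample text for illustration; in practice, pass the real output,
    # e.g. subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    sample = "Cuda compilation tools, release 12.4, V12.4.131"
    # (12, 8) is a placeholder minimum, not a documented requirement.
    print(parse_nvcc_release(sample), toolkit_is_at_least(sample, (12, 8)))
```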