Flash Attention is not working in VGGT
Hello, I'm the student who previously asked about Flash Attention not working in VGGT. I have installed the Windows build of the Flash Attention wheel and my standalone test code passes, confirming the module itself works. However, I still encounter errors when running your code, which is quite confusing. Could you please offer some suggestions?
This is my whl file
This is the code file: 1.txt
This is the test code
This is the output result
Shape consistency: True
Maximum absolute error: 0.001953125
Within tolerance range: True

Performance comparison:
Flash Attention average time: 0.214 ms
Standard attention average time: 4.410 ms
Speedup: 20.6x
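For context, the check my test script does is roughly the following (a minimal sketch, not the attached notebook itself; it assumes flash-attn's flash_attn_func with its default 1/sqrt(head_dim) scaling):

```python
import torch
from flash_attn import flash_attn_func

B, N, H, D = 2, 1024, 8, 64
q, k, v = (torch.randn(B, N, H, D, device="cuda", dtype=torch.float16) for _ in range(3))

# Standalone flash-attn expects (batch, seqlen, num_heads, head_dim) inputs.
out_flash = flash_attn_func(q, k, v)

# Reference: plain softmax attention in the (batch, num_heads, seqlen, head_dim) layout.
qt, kt, vt = (t.transpose(1, 2).float() for t in (q, k, v))
out_ref = (torch.softmax(qt @ kt.transpose(-2, -1) / D**0.5, dim=-1) @ vt).transpose(1, 2)

print("Shape consistency:", out_flash.shape == out_ref.shape)
print("Maximum absolute error:", (out_flash.float() - out_ref).abs().max().item())
```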
I noticed that your code uses PyTorch 2.3.1, while I'm using PyTorch 2.4.0.
I modified torch.cuda.amp.autocast in the code to torch.amp.autocast('cuda', ...) and made no other changes.
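For reference, the change is just the autocast API rename introduced in PyTorch 2.4 (the module and dtype below are only placeholders for illustration; VGGT may use different ones):

```python
import torch

model = torch.nn.Linear(8, 8).cuda()   # placeholder module, just to illustrate
x = torch.randn(4, 8, device="cuda")

# PyTorch <= 2.3 style (emits a deprecation warning on 2.4):
#   with torch.cuda.amp.autocast(dtype=torch.bfloat16):
#       y = model(x)

# PyTorch 2.4 style: pass the device type as the first argument.
with torch.amp.autocast("cuda", dtype=torch.bfloat16):
    y = model(x)
```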
Environment: CUDA 12.4, Flash Attention 2.7.4, PyTorch 2.4.0, Python 3.11.11, Windows.
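In case it helps to compare environments, a small snippet that prints the relevant versions (not part of the repo, just my own check):

```python
import sys
import torch
import flash_attn

print("Python:", sys.version)
print("PyTorch:", torch.__version__)
print("CUDA (torch build):", torch.version.cuda)
print("flash-attn:", flash_attn.__version__)
print("SDPA flash backend enabled:", torch.backends.cuda.flash_sdp_enabled())
```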
Also, when I used another computer with CUDA 12.1, Flash Attention 2.7.0, PyTorch 2.3.1 on Windows, I encountered the same error.
Hi, we use the flash attention embedded in PyTorch via F.scaled_dot_product_attention. If you have installed Flash Attention yourself, you need to replace the line below with your own implementation:
https://github.com/facebookresearch/vggt/blob/c4b5da2d8592a33d52fb6c93af333ddf35b5bcb9/vggt/layers/attention.py#L61
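For example, a rough, untested sketch of such a swap (the helper name is made up; it assumes q/k/v arrive in the (B, num_heads, N, head_dim) layout that F.scaled_dot_product_attention uses, so check the actual shapes and dropout handling in attention.py):

```python
import torch
from flash_attn import flash_attn_func

def sdpa_via_flash_attn(q, k, v, dropout_p=0.0):
    """Hypothetical drop-in for F.scaled_dot_product_attention(q, k, v, dropout_p=...).

    flash_attn_func wants (B, N, num_heads, head_dim) tensors in fp16/bf16,
    while SDPA is called with (B, num_heads, N, head_dim), hence the transposes.
    """
    out = flash_attn_func(
        q.transpose(1, 2).to(torch.float16),
        k.transpose(1, 2).to(torch.float16),
        v.transpose(1, 2).to(torch.float16),
        dropout_p=dropout_p,
    )
    return out.transpose(1, 2).to(q.dtype)
```

The SDPA call on that line would then become something like x = sdpa_via_flash_attn(q, k, v, dropout_p=...), keeping whatever dropout value the original code passes.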
I'm using the same Flash Attention codebase (identical names and implementation) as you; the only difference is that mine runs on Windows while yours presumably runs on Linux. This is my test.ipynb
When I run your attention module code, it still doesn't work.
So I removed the F. prefix to match how my test code calls it, but it still doesn't work.
I can confirm that the naming is exactly the same—no missing or extra letters in the function/module names.
Hi, this doesn't look like a problem with the repository; check for issues with the Flash Attention / SDPA backend on Windows. Maybe one of these solves your problem (they report the same error): https://stackoverflow.com/questions/78746073/how-to-solve-torch-was-not-compiled-with-flash-attention-warning or https://github.com/Dao-AILab/flash-attention/issues/962
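One quick way to check whether the warning from those links applies to your Windows build is to restrict SDPA to the flash backend; if your PyTorch build was compiled without it, the call below fails loudly instead of silently falling back (a sketch, assuming PyTorch >= 2.3):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

print("flash SDP enabled:", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient SDP enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())

q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Allow only the flash backend; a build without it raises a RuntimeError here.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
print("flash backend works, output shape:", tuple(out.shape))
```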