denoising-diffusion-pytorch

A800 not working

Open azazelplus opened this issue 1 year ago • 1 comment

I am using a server with an A800 GPU. As a test, I created an empty file, test.py, pasted in the code from the USAGE section, and ran python test.py in bash. It runs very slowly, and GPU utilization stays at 0%.

The output is:

Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda
/opt/conda/lib/python3.11/contextlib.py:105: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
  self.gen = func(*args, **kwds)
sampling loop time step: 3%|███▉

I also tried changing the code in attend.py to force flash attention on, but the result was the same.
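
One thing that may be worth ruling out: the log line above only promises an accelerated attention path "if input tensor is on cuda", and 0% GPU utilization would be consistent with the model and data never leaving the CPU. The following is a minimal diagnostic sketch, not part of the library. The helper names pick_device and parameter_devices are hypothetical, and it assumes the model object is an ordinary torch.nn.Module.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and can see a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # No PyTorch in this environment, so nothing can run on the GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def parameter_devices(module) -> list:
    """List the distinct devices that hold a module's parameters.

    `module` is assumed to expose .parameters() the way torch.nn.Module
    does; a result of ["cpu"] means no weights were moved to the GPU.
    """
    return sorted({str(p.device) for p in module.parameters()})
```

If pick_device() returns "cuda" but parameter_devices(...) on the model reports only "cpu", then moving both the module and the input tensors with .to("cuda") before the forward pass and before sampling would be the next thing to try. Whether that explains the slowdown on this particular server is an assumption, not something the log confirms.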

azazelplus · Mar 29 '25 13:03