
ScaledDotProduct cannot run on cuda:1

Open 1049451037 opened this issue 1 year ago • 5 comments

🐛 Bug

When I run on cuda:0, everything works fine. But when I run on cuda:1, the following error occurs:

Triton softmax kernel register spillover or invalid image caught. Deactivating this kernel, please file an issue int the xFormers repository
Triton Error [CUDA]: context is destroyed


To Reproduce

import torch

from xformers.components.attention import ScaledDotProduct

model = ScaledDotProduct().cuda(1)

q = torch.randn(16, 16, 64).cuda(1).requires_grad_()
k = torch.randn(16, 8, 64).cuda(1).requires_grad_()
v = torch.randn(16, 8, 64).cuda(1).requires_grad_()
mask = torch.tensor([[True] + [False] * 7] * 16, dtype=torch.bool).cuda(1)

out = model(q, k, v, att_mask=mask)
breakpoint()

Environment

latest xformers built from source

1049451037 avatar Mar 06 '23 12:03 1049451037

I'm not sure about this error. Maybe @fmassa knows who would be the right POC there? Also, do you have a stack trace for the error, @1049451037?

danthe3rd avatar Mar 06 '23 16:03 danthe3rd

I am also getting this error

pmcvay avatar Jul 12 '23 15:07 pmcvay

Is it possible to disable the triton softmax kernel as a temporary workaround?

pmcvay avatar Jul 12 '23 20:07 pmcvay

Same problem here. How bad is this bug, i.e. can I simply ignore the "Triton Error"?

vladchimescu avatar Feb 13 '24 16:02 vladchimescu

I think Triton launches kernels on the current CUDA device, and you usually want the tensors you pass in to be on that device. This means you may need to set that device manually, unlike most simple PyTorch operations, which run on the device of their inputs.

So in the code above, you could maybe try replacing

out = model(q, k, v, att_mask=mask)

with

with torch.cuda.device(q.device):
    out = model(q, k, v, att_mask=mask)

bottler avatar Feb 13 '24 17:02 bottler
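For anyone wanting to sanity-check results while the Triton kernel is misbehaving: below is a minimal CPU-only sketch (plain PyTorch, not xFormers) of the masked scaled dot-product attention the repro computes, softmax(q kᵀ / √d) v with a boolean mask where True means "attend". The shapes mirror the repro above; the comparison against torch.nn.functional.scaled_dot_product_attention assumes PyTorch >= 2.0.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Same shapes as the repro: (batch, L, d) query, (batch, S, d) key/value.
q = torch.randn(16, 16, 64)
k = torch.randn(16, 8, 64)
v = torch.randn(16, 8, 64)
# (L, S) boolean mask, True = position may be attended to.
mask = torch.tensor([[True] + [False] * 7] * 16, dtype=torch.bool)

# Manual masked attention: scale scores, mask with -inf, softmax, weight values.
scores = q @ k.transpose(-2, -1) / (64 ** 0.5)    # (batch, L, S)
scores = scores.masked_fill(~mask, float("-inf"))
out_manual = torch.softmax(scores, dim=-1) @ v    # (batch, L, d)

# Built-in reference; the (L, S) mask broadcasts across the batch dimension.
out_ref = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

assert torch.allclose(out_manual, out_ref, atol=1e-5)
```

This runs entirely on CPU, so it can serve as a ground truth to compare against the xFormers output on either GPU once the kernel issue is worked around.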