
TC ops fail to end when calculation is finished

Open Junonia opened this issue 6 years ago • 2 comments

I am experimenting with a simple TC op. The calculation finishes quickly, but the program then hangs and never exits. Both the GPU and the CPU stay at full utilization even after the calculation is finished. The example matmul code works fine. Am I doing something wrong, or is this a bug?

Below is the information:
- OS: CentOS 7
- How you installed TC: conda
- Python version: 3.6.5
- Conda version: 4.4.11

Here is the code.

import tensor_comprehensions as tc
import torch
import timeit

lang = """
def pairsum(float(L, M, 3) A, float(L, M, 3) B) -> (out) {
  out(l) +=! A(l, i, 0) - A(l, j, 0) +
     A(l, i, 1) - A(l, j, 1) +
     A(l, i, 2) - A(l, j, 2) +
     B(l, i, 0) - B(l, j, 0) +
     B(l, i, 1) - B(l, j, 1) +
     B(l, i, 2) - B(l, j, 2)
  }
  """

pairsum = tc.define(lang, name="pairsum")
mat1, mat2 = torch.randn(32, 1536, 3).cuda(), torch.randn(32, 1536, 3).cuda()

def test():
  out = pairsum(mat1, mat2)

print(timeit.timeit(test, number=1000))
print("test finished")

Junonia, Apr 03 '18 13:04

Thanks @Junonia for the report. My guess is that the GPU is being held either because the kernel is very slow or because compilation is stuck.

Tentatively passing this to @ftynse to see whether he has ideas, or knows who might, about what's going on here. Please feel free to assign it to me afterwards. :) Thanks

prigoyal, Apr 09 '18 13:04
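
For a rough sense of why a slow kernel is plausible, the sketch below counts the reduction iterations implied by the shapes in the report. It assumes that i and j are both inferred to range over M = 1536, since they appear only on the right-hand side and so act as reduction indices. The variable names are illustrative and not part of TC.

```python
# Shapes taken from the posted script: A and B are (L, M, 3).
L, M = 32, 1536
calls = 1000  # the `number` argument passed to timeit.timeit

# Assumption: i and j each range over M, so every output element out(l)
# accumulates M * M terms.
iterations_per_call = L * M * M
total_iterations = iterations_per_call * calls

print(f"reduction iterations per call: {iterations_per_call:,}")
print(f"reduction iterations over {calls} calls: {total_iterations:,}")
```

Each iteration reads twelve tensor elements, so if the kernel runs with untuned mapping options (an assumption; the thread does not say which options were used), 1000 queued calls could keep the GPU busy long after the Python loop has returned.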

Hmmm, I've seen Python processes from dead PyTorch sessions showing up in nvidia-smi even without TC.

@prigoyal does the call to a TC function block and wait for the CUDA kernel to complete? My only guess is that something returns early and the kernel keeps running. In this code, out is never read, so there is no guarantee that the computation has actually finished when timeit returns.

ftynse, Apr 09 '18 16:04
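
One way to test the "returns early" hypothesis is to force a synchronization before the timer stops. The following is a minimal sketch, not a confirmed fix: `time_with_sync` is a hypothetical helper, and whether a TC call blocks is exactly the open question above. `torch.cuda.synchronize()` is the standard PyTorch call that waits for all queued CUDA work on the current device.

```python
import timeit


def time_with_sync(fn, sync, number=10):
    """Time `number` calls to fn, calling sync() once before the clock stops.

    `sync` should block until all asynchronously queued work is done,
    for example torch.cuda.synchronize.
    """
    def run():
        for _ in range(number):
            fn()
        sync()  # wait for queued work so it is included in the measurement

    return timeit.timeit(run, number=1)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    # Only runs on a machine with PyTorch and a CUDA device.
    if torch is not None and torch.cuda.is_available():
        a = torch.randn(1024, 1024).cuda()
        elapsed = time_with_sync(lambda: a @ a, torch.cuda.synchronize, number=100)
        print(f"100 matmuls including synchronization: {elapsed:.4f} s")
```

In the script from the report, the same helper could wrap `test` with `torch.cuda.synchronize` as the `sync` argument. If the synchronized time is much larger than the original measurement, the kernels were still running when "test finished" was printed.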