
sgmv_cutlass calculates wrong output

Open harryhan618 opened this issue 1 year ago • 8 comments

I'm running the code below and getting a wrong result. I initialize x and w to all ones, so every element of the output y should equal h1 = 4096.

But that is not what I get: half of the output is 4096 and the other half is 2528. Weird! From what I have observed, the wrong answer appears when h2 >= 32 for shrink.

The following code is adapted from benchmarks/bench_sgmv_cutlass.py:

import torch
import punica.ops

bs = 4
h1 = 4096
h2 = 32
num_layers = 1
dtype = torch.float16
device = torch.device("cuda:0")
problem_sizes = [2, 2]

w = [
    torch.ones((num_layers, h1, h2), dtype=dtype, device=device)
    for _ in range(len(problem_sizes))
]
w_ptr = torch.tensor([t.data_ptr() for t in w],
                     dtype=torch.int64,
                     device=device)
s = torch.cumsum(
    torch.tensor([0] + problem_sizes, device=device),
    dim=0,
    dtype=torch.int32)
x = torch.ones((s[-1], h1), dtype=dtype, device=device)
y = torch.zeros((s[-1], h2), dtype=dtype, device=device)
punica.ops.sgmv_cutlass(y, x, w_ptr, s, layer_idx=0)

print(y)
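
For comparison, the expected result can be sketched without a GPU or punica. This is only a minimal reference, assuming the op computes y[s[i]:s[i+1]] += x[s[i]:s[i+1]] @ w[i] for each segment i; `sgmv_reference` is a hypothetical helper written for this illustration, not a punica function, and it uses plain Python lists instead of tensors.

```python
# Dependency-free reference for the segmented matmul, ASSUMING the semantics
#   y[s[i]:s[i+1]] += x[s[i]:s[i+1]] @ w[i]
# sgmv_reference is a hypothetical helper, not part of punica.

def sgmv_reference(x, w, s):
    """x: list of rows, each of length h1.
    w: one h1-by-h2 matrix (list of rows) per segment.
    s: segment boundaries, e.g. [0, 2, 4].
    Returns y as a list of rows, each of length h2."""
    h2 = len(w[0][0])
    y = [[0.0] * h2 for _ in range(len(x))]
    for seg in range(len(s) - 1):
        # Transpose once so every output column of w[seg] is one tuple.
        cols = list(zip(*w[seg]))
        for row in range(s[seg], s[seg + 1]):
            # Accumulate the dot product of the input row with each column.
            y[row] = [
                acc + sum(a * b for a, b in zip(x[row], col))
                for acc, col in zip(y[row], cols)
            ]
    return y


h1, h2 = 4096, 32
problem_sizes = [2, 2]

# Same cumulative-sum segment boundaries as in the repro: [0, 2, 4].
s = [0]
for n in problem_sizes:
    s.append(s[-1] + n)

x = [[1.0] * h1 for _ in range(s[-1])]
w = [[[1.0] * h2 for _ in range(h1)] for _ in problem_sizes]

y_ref = sgmv_reference(x, w, s)
# True when every element equals h1.
print(all(v == float(h1) for row in y_ref for v in row))
```

With all-ones inputs, every entry of `y_ref` is a sum of h1 ones, so the check holds for all four rows. Under the assumed semantics, the 2528 entries are therefore not what the inputs call for.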

harryhan618 • Nov 17 '23 09:11