[QST] Gemm results are different with tile_description?
What is your question?
It seems that adding tile_description changes the GEMM result. assert (tensor_D_numpy - tensor_D).max() == 0.0 passes if I add tile_description.
import numpy as np
import random
import torch
import cutlass
# This controls whether the C++ GEMM declaration will be printed at each step.
# Set to `False` to omit this information.
print_module = True
m = 8192
n = 8192
k = 8192
dtype = torch.float16
tensor_A = torch.rand(m, k, device=0, dtype=torch.float16)
tensor_B = torch.rand(k, n, device=0, dtype=torch.float16)
tensor_C = torch.zeros(m, n, device=0, dtype=torch.float16)
tensor_D = torch.zeros(m, n, device=0, dtype=torch.float16)
alpha = 1
beta = 0.0
# We specify `element_accumulator` here so as to match the reference computed with PyTorch below. However,
# specifying `element_accumulator` is not required if it is the same as `element`.
plan = cutlass.Gemm(element=dtype, layout=cutlass.LayoutType.RowMajor, element_accumulator=torch.float32)
"""
tile_description = {
    "threadblock_shape": [128, 256, 32],  # Threadblock shape
    "stages": 3,  # Number of stages
    "warp_count": [2, 4, 1],  # Number of warps within each dimension of the threadblock shape
    "instruction_shape": [16, 8, 16],
    "cluster_shape": [1, 1, 1]
}
plan.tile_description = tile_description
"""
plan.run(tensor_A, tensor_B, tensor_C, tensor_D, print_module=print_module)
tensor_D_numpy = (alpha * (tensor_A @ tensor_B)) + (beta * tensor_C)
assert (tensor_D_numpy - tensor_D).max() == 0.0
print(tensor_D)
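As a side note on the check itself, here is a minimal sketch of a tolerance-based comparison, written with NumPy so it runs without a GPU. The helper name `compare` and the `rtol`/`atol` values are assumptions chosen for illustration, not anything taken from CUTLASS. It illustrates two points: `(x - y).max()` without `abs()` can miss negative differences, and two float16 results that sum over k in a different grouping may not be bit-identical, so a tolerance is usually a safer test than `== 0.0`.

```python
import numpy as np

def compare(result, reference, rtol=1e-2, atol=1e-2):
    """Return (max_abs_diff, within_tolerance); both inputs are compared in float32.

    `rtol` and `atol` are illustrative defaults, not recommended values.
    """
    result = np.asarray(result, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float32)
    # Use the absolute difference so that negative errors are not hidden by max().
    max_abs_diff = float(np.abs(result - reference).max())
    within = bool(np.allclose(result, reference, rtol=rtol, atol=atol))
    return max_abs_diff, within

rng = np.random.default_rng(0)
a = rng.random((64, 256), dtype=np.float32).astype(np.float16)
b = rng.random((256, 64), dtype=np.float32).astype(np.float16)
a32, b32 = a.astype(np.float32), b.astype(np.float32)

# Same product with float32 accumulation, rounded to float16 at the end ...
full = (a32 @ b32).astype(np.float16)
# ... but here k is summed in two halves, i.e. a different accumulation order.
split = (a32[:, :128] @ b32[:128] + a32[:, 128:] @ b32[128:]).astype(np.float16)

diff, ok = compare(full, split)
print(f"max abs diff = {diff}, within tolerance = {ok}")
```

With the tensors from the script above, the same idea can be expressed as `torch.allclose(tensor_D, tensor_D_numpy, rtol=..., atol=...)`, with tolerances picked to suit float16 outputs.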
Can you please list the C++ output you see after the call to plan.run() with print_module=True?