cutlass issues

[QST]question about cutlass epilogue customization

16

**What is your question?** Does the CUTLASS epilogue support customization? I would like to perform a bias addition after the matmul operation in CUTLASS....

zwshan

question

inactive-30d

[QST] thread num assert in sm70_epilogue_vectorized

3

Hi, @thakkarV https://github.com/NVIDIA/cutlass/blob/47a3ebbea9860e14c095b52c4e6e2db33340f572/include/cutlass/epilogue/collective/sm70_epilogue_vectorized.hpp#L237 Strangely, it requires TiledCopyS2R's thread count to equal the MMA AtomC's thread count. I think this is where we describe how each thread does LDS, and therefore it should be: ```c++...

mammoth831

question

? - Needs Triage

inactive-30d

inactive-90d

[QST] Is it possible to detect output coordinates in elementwise epilogue ?

4

**What is your question?** Hello! I want to implement an elementwise epilogue that depends on the output matrix coordinates, i.e. ``` d_ij = F(alpha * sum_k(a_ik * b_kj) + c_ij, i, j)...

ankutalev

question

? - Needs Triage

[QST] Is there any other legal layout in cutlass?

1

I see this in example code: https://github.com/NVIDIA/cutlass/blob/main/examples/cute/tutorial/sgemm_nt_1.cu So I wonder is there any other legal layout? // Define block sizes (static) auto bM = Int{}; auto bN = Int{}; auto...

ziyuhuang123

question

[QST] What is the difference between make_shape and make_tile?

5

Thank you for your great work, for both cutlass and cute. I'm following instructions to build my program. I use make_tensor to build rav as a pointer to specific register...

ZhangZhiPku

question

inactive-30d

inactive-90d

[QST] Cosize and Complement

3

Can someone walk through how `cosize` and `complement` are derived, using a few simple examples? For example, when I run the `complement.cpp` `CuTe` unit test with `CUTLASS_TRACE_DEBUG_LEVEL=1`, the following is printed,...

jeromeku

question

? - Needs Triage

inactive-30d

[BUG] About the rounding mode of TF32

2

https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/tfloat32.h#L80 Why does the code represent "round toward nearest even", but not "round to nearest, ties away from zero"?

beginlner

inactive-30d

[QST]Could not find compiler set in environment variable CUDACXX:

3

I can use nvcc in my environment, but when I run "cmake .. -DCUTLASS_NVCC_ARCHS=80", an error happens: -- CMake Version: 3.18.2 CMake Error at /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:25 (message): Could not find compiler...

Arthur-Ling

question

inactive-30d

[BUG] right_inverse with 1d layout results "#2700-D: attempt to access expired storage" on constexpr return value.

8

**Describe the bug** **Steps/Code to reproduce bug** ```cuda #include "cute/tensor.hpp" using namespace cute; __global__ void kernel() { constexpr auto weird = right_inverse(make_layout(_2{}, _1{})); print(weird); } int main() { kernel(); cudaDeviceSynchronize();...

cloudhan

bug

? - Needs Triage

[QST]iterator store interface[store(Fragment &frag,TensorCoord const &tile_offset)] not offset in units of whole tiles

13

In the file include\cutlass\gemm\warp\mma_tensor_op_tile_iterator.h, the comments on the iterator's store interface say it "stores a tile with a logical offset in units of whole tiles". However, TensorRef is actually...

zhiyu-deep

question

inactive-30d

inactive-90d
