cutlass issues

[QST] Any theory about the "layout algebra"?

7

**What is your question?** Hi there, thank you for the work on CUTLASS 3.0/CuTe. The "layout algebra" in 3.0 is much more elegant and easier to use than iterators. I guess...

eatingtomatoes

question

inactive-30d

CuTe

[DOC] cute/02_layout_operations.md

1

Could anybody explain "Layout compatibility"? Examples would be nice.

pengl

documentation

? - Needs Triage

inactive-30d

CuTe

[FEA] cute hopper conv example

2

How can general conv fwd/dgrad/wgrad be implemented with CuTe? Could you give examples based on Hopper CuTe?

zhang662817

feature request

help wanted

inactive-30d

inactive-90d

CuTe

[QST] Support for Sparse Tensor Operations in CuTe

2

I have gone through the documentation and the available APIs, but I couldn't find explicit information on whether CuTe supports sparse tensor operations or not. Does CuTe currently support sparse...

jiguanglizipao

feature request

question

inactive-30d

CuTe

[QST] How to understand "Semaphore"

2

Hello CUTLASS experts, I'm confused by the implementation of Semaphore. Its "fetch" looks like this:

```C++
if (wait_thread) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 700
  asm volatile ("ld.global.acquire.gpu.b32 %0, [%1];\n"...
```

Shaquille-Wu

question

? - Needs Triage

inactive-30d

inactive-90d

[FEA] CUTLASS should ensure all its symbols are hidden from shared object libraries

2

**Is your feature request related to a problem? Please describe.** As a user of CUTLASS, I would like to build a shared object library, `libA.so`, that internally uses CUTLASS function...

jrhemstad

feature request

inactive-30d

inactive-90d

[QST] two files are included in each other

1

https://github.com/NVIDIA/cutlass/blob/5c447dd84f8ae0e1d48ff9a2eae26ce8c4958101/include/cutlass/gemm/warp/default_mma_tensor_op.h#L121 https://github.com/NVIDIA/cutlass/blob/5c447dd84f8ae0e1d48ff9a2eae26ce8c4958101/include/cutlass/gemm/warp/default_mma_tensor_op_sm80.h#L43 `default_mma_tensor_op.h` includes `default_mma_tensor_op_sm80.h`, while the latter also includes the former. Is this a problem?

wzhcz8902

inactive-30d

[QST] typo in comment

1

### Discussed in https://github.com/NVIDIA/cutlass/discussions/1504 Originally posted by **wzhcz8902** April 28, 2024 https://github.com/NVIDIA/cutlass/blob/5c447dd84f8ae0e1d48ff9a2eae26ce8c4958101/include/cutlass/gemm/warp/mma_tensor_op.h#L140-L168 As a newbie to cutlass, I think this struct is targeted for tensor cores, not cuda cores as...

wzhcz8902

inactive-30d

feat: support kFactor 8 used in mma tensor op tile iterator

2

Support `Layout::kFactor` of 8 when loading data from shared memory: |0 | 16 | 32 | 48 | 64 | 80 | 96 | 112| |-- | -- | -- |...

gavinchen430

inactive-30d

[QST] Could you please help me understand how right_inverse works?

5

**What is your question?** I am trying to understand how right_inverse works in CuTe. For example, https://github.com/NVIDIA/cutlass/blob/main/test/python/pycute/test_right_inverse.py#L88 The given input is `Layout((2,4,6),(4,1,8))`, I just couldn't figure out why...

jakedevtec

question
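
For readers of the right_inverse question above, here is a minimal pure-Python sketch of what the property means for `Layout((2,4,6),(4,1,8))`. It does not use pycute. The helper `layout_index` is a hypothetical name, and the candidate inverse `(4,2,6):(2,1,8)` is worked out by hand here as an assumption, not taken from the library's output. The sketch assumes CuTe's usual convention that a 1-D index is split into coordinates with the leftmost mode varying fastest, and checks that applying the candidate and then the original layout gives back the index.

```python
def layout_index(shape, stride, idx):
    """Map a 1-D index to an offset for a flat (non-nested) layout.

    The index is split into coordinates with the leftmost mode varying
    fastest, then each coordinate is multiplied by its stride.
    """
    offset = 0
    for extent, step in zip(shape, stride):
        offset += (idx % extent) * step
        idx //= extent
    return offset


# The layout from the question: shape (2,4,6), stride (4,1,8).
shape, stride = (2, 4, 6), (4, 1, 8)
size = 2 * 4 * 6

# Hand-derived candidate right inverse (an assumption, not library output).
inv_shape, inv_stride = (4, 2, 6), (2, 1, 8)

# Right-inverse property: layout(inverse(i)) == i for every i in range(size).
composed = [
    layout_index(shape, stride, layout_index(inv_shape, inv_stride, i))
    for i in range(size)
]
is_right_inverse = composed == list(range(size))
print(is_right_inverse)
```

The intuition under these assumptions: the stride-1 mode of the original layout has extent 4 and sits second, after a mode of extent 2, so its coordinates are 2 apart in 1-D index space. The candidate therefore starts with a mode of shape 4 and stride 2.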
