ziyuhuang123
I am studying https://github.com/NVIDIA/cutlass/blob/06b21349bcf6ddf6a1686a47a137ad1446579db9/include/cutlass/gemm/collective/sm90_mma_tma_gmma_ss_warpspecialized.hpp#L73 and I am curious why we have two consumers here. Wouldn't a single consumer be better?
I am writing example 48, and I noticed this file: https://github.com/NVIDIA/cutlass/blob/06b21349bcf6ddf6a1686a47a137ad1446579db9/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp#L321 I was surprised to find that the epilogue is in the producer, but we do not even enter the epilogue?! Previously I guessed...
[sm90_gemm_tma_warpspecialized_cooperative](https://github.com/NVIDIA/cutlass/blob/06b21349bcf6ddf6a1686a47a137ad1446579db9/include/cutlass/gemm/kernel/sm90_gemm_tma_warpspecialized_cooperative.hpp#L326)
```
enum class ProducerWarpRole {
  Mainloop = 0,
  Warp1 = 1,
  Epilogue = 2,
  Warp3 = 3
};
```
I find usage of Mainloop and Epilogue, but no usage...
```
pytest -q -s test_flash_attn.py
```
And I get this:
========================================= short test summary info ==========================================
FAILED test_flash_attn.py::test_flash_attn_output[257-1-1.0-128-False-False-mha-dtype0] - AssertionError: assert 0.0078125
Could you provide a valid mirror? Thanks!
```
/home/zyhuang/anaconda3/envs/py_hzy_new/bin/../lib/gcc/x86_64-conda-linux-gnu/11.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lcuda: No such file or directory
```
Could you provide a GEMM kernel?
First, the env.src is incorrect, so I modified it to:
```
# So that you can see the python packages from the tests
export PYTHONPATH=${PYTHONPATH}:$PWD/include/common/pyutils
export THUNDERKITTENS_ROOT=${PWD}/include
```
But I...
**What is your question?** I encountered a strange bug. First, my SMEM is divided into two regions. One part is for the mainloop (reading A and B), and the other...
Hi! I am wondering whether I can create a null tensor. I am creating a tensor within one {} block but need to use it in another {} block. The lifetime of a...