composable_kernel
Kernarg latency optimization
This depends on PR#1028. Beyond that PR, only a few files are changed:
modified: example/53_gemv_splitk/CMakeLists.txt
modified: example/54_tall_and_skinny_gemm_splitk/CMakeLists.txt
modified: example/54_tall_and_skinny_gemm_splitk/run_tall_and_skinny_gemm_splitk_example.inc
modified: include/ck/host_utility/kernel_launch.hpp
modified: include/ck/tensor_operation/gpu/device/impl/device_tall_and_skinny_gemm_splitk.hpp
modified: include/ck/tensor_operation/gpu/grid/gridwise_tall_and_skinny_gemm_splitk.hpp
conflict resolved: library/src/tensor_operation_instance/gpu/CMakeLists.txt