
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

Results 151 composable_kernel issues

- add vectorization support for custom types defined as non-native types or user-defined objects
- add a test to check if native and non-native vector types take same expected space...
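The size check described in the second bullet can be sketched at compile time. This is only an illustration: `custom_f8`, `vec`, and `same_storage` are hypothetical names invented here, not types from composable_kernel, and the sketch assumes the custom type wraps a single native value.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical user-defined 8-bit type wrapping a native byte.
struct custom_f8 {
    std::uint8_t bits;
};

// Hypothetical fixed-size vector container (not the library's vector type).
template <typename T, std::size_t N>
struct vec {
    T data[N];
};

// True when a vector of A and a vector of B occupy the same number of bytes.
template <typename A, typename B, std::size_t N>
constexpr bool same_storage() {
    return sizeof(vec<A, N>) == sizeof(vec<B, N>);
}

// Compile-time checks: the wrapper must not add padding, at scalar or vector level.
static_assert(sizeof(custom_f8) == sizeof(std::uint8_t),
              "custom type should be as small as its native storage");
static_assert(same_storage<custom_f8, std::uint8_t, 4>(),
              "non-native vector should match the native vector size");
```

Checking with `static_assert` means a layout mismatch fails the build rather than a runtime test, which may or may not match how the actual test in the pull request is written.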

https://github.com/ROCm/composable_kernel/blob/a90bfa9857da5cc35a9c5dc1f068b538a1e64c9b/include/ck/utility/amd_smfmac.hpp#L19 in #1309: this instruction should be built only for *gfx94* platforms

bug
urgency_blocker

Added support for an LDS prefetch pipeline, enabled for one of the GEMM sources read from LDS. It could be applicable to both sources.

https://github.com/ROCm/composable_kernel/blob/acda4c5a3c34c13b71475fdd963e61182bba8a76/example/ck_tile/01_fmha/CMakeLists.txt#L12-L15

CMake issue after #1286. The cmake command:

> CXX=/opt/rocm/bin/amdclang++ CC=/opt/rocm/bin/amdclang cmake -DCMAKE_PREFIX_PATH="/opt/rocm" -DCMAKE_BUILD_TYPE=Release -DGPU_TARGETS="gfx942" ..

```
adding example example_gemm_multiply_multiply_xdl_fp16
Traceback (most recent call last):
  File "/data/driver/composable_kernel/example/ck_tile/01_fmha/generate.py", line 9, in...
```

Final project submission for Nick Rackley, Casey Morrison, Bryan Gonzalez. Contribution of removing padding from StreamK.

@bghimireamd [Informative] > Had to rename [MIOpen's #define ENV ](https://github.com/ROCm/MIOpen/blob/develop/src/include/miopen/env.hpp#L156) to [#define MIOPEN_ENV](https://github.com/ROCm/MIOpen/blob/426d0a17627e7d6fba555a27be8ffe612c25c97e/src/include/miopen/env.hpp#L156) because it had a name collision with CK's recently introduced macro of the same name, [#define ENV](https://github.com/ROCm/composable_kernel/blob/aab4439f0f5e8c2c94c440aa103cecd1b74e3133/include/ck/utility/env.hpp#L127)...
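Macros ignore namespaces, so two headers that both define a short name like `ENV` conflict as soon as they meet in one translation unit. A minimal sketch of the prefixing approach follows; `LIBA_ENV`, `LIBB_ENV`, and `env_or` are hypothetical names for illustration, and the macro bodies are not the actual MIOpen or CK definitions.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-ins for two libraries that each wanted a macro named ENV.
// Giving each macro a library prefix lets both headers coexist in one
// translation unit without a redefinition conflict.
#define LIBA_ENV(name) std::getenv("LIBA_" #name)
#define LIBB_ENV(name) std::getenv("LIBB_" #name)

// Helper: return the environment value if set, otherwise a fallback.
inline std::string env_or(const char* value, const std::string& fallback) {
    return value != nullptr ? std::string(value) : fallback;
}
```

With the prefixes in place, `env_or(LIBA_ENV(LOG_LEVEL), "info")` and `env_or(LIBB_ENV(LOG_LEVEL), "info")` read `LIBA_LOG_LEVEL` and `LIBB_LOG_LEVEL` respectively, and neither library depends on include order.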

### Problem Description

This issue happens in cpp_extension. I use PyTorch's cpp_extension to compile CK instead of regular CMake. When we cast from __half to _Float16 if to compile...

- Add a naive gpu gemm reference kernel
- Switch gemm examples to this verification path
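For readers unfamiliar with the idea, a naive reference GEMM is a direct triple loop whose result is compared against the optimized kernel's output. The sketch below is a host-side C++ analogue, assuming row-major `float` matrices; `naive_gemm` and `all_close` are hypothetical names, and the kernel in the pull request runs on the GPU and may differ in layout, data types, and tolerances.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Naive row-major GEMM: C[M x N] = A[M x K] * B[K x N].
// On a GPU, each (m, n) pair would typically map to one thread.
std::vector<float> naive_gemm(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t M, std::size_t N, std::size_t K) {
    std::vector<float> c(M * N, 0.0f);
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += a[m * K + k] * b[k * N + n];
            }
            c[m * N + n] = acc;
        }
    }
    return c;
}

// Element-wise comparison with absolute and relative tolerance,
// used to verify an optimized result against the reference.
bool all_close(const std::vector<float>& x, const std::vector<float>& y,
               float atol = 1e-5f, float rtol = 1e-5f) {
    if (x.size() != y.size()) return false;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (std::fabs(x[i] - y[i]) > atol + rtol * std::fabs(y[i])) return false;
    }
    return true;
}
```

A verification path would then call something like `all_close(optimized_result, naive_gemm(a, b, M, N, K))`, with tolerances loosened for lower-precision types.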