zjing14

Results 14 issues of zjing14

- Use the DPP8 Gemm utilities implemented by @geyyer (https://github.com/ROCmSoftwarePlatform/composable_kernel/pull/657/) to finish an fp16 Gemm example - @bwroblew

Refer to: https://github.com/microsoft/onnxruntime/blob/main/cmake/patches/composable_kernel/Fix_Clang_Build.patch
Need the following changes in CK cmakefile
Line: 9
```
# Check support for CUDA/HIP in Cmake
-project(composable_kernel)
+project(composable_kernel LANGUAGES CXX HIP)
```
Line: 46 `-link_libraries(hip::device)` Line:...

Updated cache for Jing's optimizations on RNN with GEMM fusion

I found there may be bugs (see below) in the get_distance function of MIOpenGEMM that prevent MIOpenGEMM from finding the best config in the kernel cache. I think there are...

``` ./bin/test_conv2d --half --cmode conv --pmode default --group-count 1 --disable-backward-weights --input 16, 16, 7, 7 --weights 4, 16, 1, 1 --pads_strides_dilations 0 0 1 1 1 1 --trans_output_pads 0 0...

value_low
workaround