carlushuang

Results 28 comments of carlushuang

@wjc404 Thanks for your interest. BTW, what are the M, N, and K parameters of the test above?

A brief description of this PR:

* fix kernel profiler on rocm
* bert embedding (embedding + layernorm) fusion support on rocm
* `conv2d_bias_add` fusion support on rocm
* `bmm_*_add` fusion support on...

Hi Du. For the AMD backend, upstream AIT currently breaks a lot of things. We have fixed them in our AMD fork of AIT: https://github.com/ROCmSoftwarePlatform/AITemplate, and plan to upstream in the...

@duli2012 Sorry for the late reply. This happens because a kernel that is not applicable interrupts the current profiling run. It should not affect the overall build process, though. In the end this example...

@yinghai @ipiszy Thanks for mentioning the rocm backend CI. We have set up CI in the [rocm fork](https://github.com/ROCmSoftwarePlatform/AITemplate) and it works fine. There are several bug fixes to let everything...

Hi @patflick and all. It has been about 2 months since the last discussion, but I encountered the same issue with the tip of the code. My environment is Ubuntu 16.04 with a manually installed gcc-7.3.0. To...

https://github.com/patflick/miopen-benchmark/pull/13 @patflick Hi, I made a quick workaround for this issue that does not use regex to check the device name. If it's not acceptable, just drop it.

For updating the kernel structure list, a possible approach could be:

```
static const std::vector kernel_param_list {
#include
};
```

Then we only need to update the structure inside the header...
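As a rough illustration of the idea in the comment above, here is a minimal sketch in which the list entries are kept apart from the container definition. It assumes a hypothetical `kernel_param_t` record and a hypothetical `KERNEL_PARAM_ENTRIES` macro that stands in for the contents of the included header, so the example fits in one file. None of these names or values come from the original project.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter record; the fields are assumptions for illustration.
struct kernel_param_t {
    std::string name;
    int tile_m;
    int tile_n;
};

// Stand-in for the header contents. In the approach described above, these
// entries would live in a separate file pulled in with #include between the
// braces, so adding a kernel only means editing that file.
#define KERNEL_PARAM_ENTRIES      \
    {"kernel_a", 64, 64},         \
    {"kernel_b", 128, 64},        \
    {"kernel_c", 128, 128},

// The container definition never changes; a trailing comma after the last
// entry is allowed in a braced initializer list.
static const std::vector<kernel_param_t> kernel_param_list {
    KERNEL_PARAM_ENTRIES
};

// Look up an entry by name; returns nullptr when no entry matches.
inline const kernel_param_t* find_kernel_param(const std::string& name) {
    for (std::size_t i = 0; i < kernel_param_list.size(); ++i) {
        if (kernel_param_list[i].name == name) {
            return &kernel_param_list[i];
        }
    }
    return nullptr;
}
```

With this layout, code that iterates over `kernel_param_list` or calls `find_kernel_param` does not need to change when an entry is added or removed.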

> 1. In the generated code, there is no instruction to output the result

What do you mean by output the result? Do you mean printing the contents of the output buffer?...

Hi @atamazov, it seems @shurale-nkn added `.amdhsa_reserve_xnack_mask` to every asm kernel source file in MIOpen. However, the xnack default value still seems to be target-dependent. Maybe I can add...