schung-amd comments

157 comments by schung-amd

[Issue]: roctracer_record_t returned device_id are off by 2. Devices are enumerated 2 to 9 instead of 0 to 7.

Hi @aaronenyeshi, as you've noted in https://github.com/pytorch/kineto/pull/926, this is due to roctracer enumerating the CPU as well as the GPU devices. This is by design; roctracer is pulling the node...
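The offset described in this comment can be sketched as follows. This is a minimal illustration, not roctracer's API: `to_gpu_index` and `num_cpu_nodes` are hypothetical names, and the sketch assumes CPU nodes are enumerated before GPU nodes, as the reported IDs 2 to 9 for 8 GPUs suggest.

```cpp
#include <cassert>

// Hypothetical helper (not part of roctracer): convert a node-based device ID,
// which counts CPU nodes as well as GPUs, into a zero-based GPU index.
// Assumption: all CPU nodes are enumerated before the first GPU node.
int to_gpu_index(int node_id, int num_cpu_nodes) {
    assert(num_cpu_nodes >= 0);
    assert(node_id >= num_cpu_nodes);  // IDs below this refer to CPU nodes
    return node_id - num_cpu_nodes;
}
```

With two CPU nodes, `to_gpu_index(2, 2)` gives 0 and `to_gpu_index(9, 2)` gives 7, which matches the 0 to 7 numbering the issue expected.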

CK_BUILD_JIT_LIB issue

Hi all, I was able to reproduce this issue. Following the instructions to build `migraphx` with cmake at https://github.com/ROCm/AMDMIGraphX, I saw the same error while running the command `CXX=/opt/rocm/llvm/bin/clang++ cmake...

CK_BUILD_JIT_LIB issue

I've reached out to the `MIGraphX` team, and we do currently rely on the specific commit of `composable_kernel` being pulled in, as it has features that were not added into...

stream create, copy and destroy example

Hi @jinz2014, `hipMemcpyAsync` (and `cudaMemcpyAsync` on the CUDA end) is asynchronous with respect to compute operations but not necessarily with respect to other memory copy operations; only one copy can be executing at a time per...

[Issue]: Cannot register Static Global Var on inline variable

Hi @tpadioleau, sorry for the delayed response. This fix isn't in a release yet as far as I can tell, but I can keep tabs on this and update you...

[Issue]: Cannot register Static Global Var on inline variable

Never mind, I was passing the wrong options to tar; the file is fine. I can confirm that this is not fixed as of ROCm 6.2.2. I'll update you when the...

[Issue]: Cannot register Static Global Var on inline variable

Sorry you're blocked by this issue. I can't make any promises, but we're looking into getting this fix into the next major release.

[Issue]: Cannot register Static Global Var on inline variable

@erayinanc @tpadioleau The fix should be in ROCm 6.3 from what I can tell, thanks for your patience! I'll leave this open for now for confirmation upon the release of...

[HIP][device] 4 __shfl_sync functions are missing

Apologies for the unclear documentation. As stated, these functions are available in ROCm 6.2 but disabled by default, and can be enabled via a preprocessor macro. If there are issues with their functionality, feel...

HIP does not `#pragma unroll` loop in some cases

Hi @shoshijak @doru1004, thanks for identifying this issue. HIP currently supports unrolling loops whose bounds are defined at compile time; see https://rocm.docs.amd.com/projects/HIP/en/docs-6.0.0/reference/kernel_language.html#pragma-unroll. In this case, `mn` is defined at run time,...
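One way to give a loop a compile-time bound is to pass the trip count as a template parameter. The sketch below is host-side C++ for illustration only; `sum_fixed` and `UNROLL_LOOP` are hypothetical names, and whether this pattern fits the kernel discussed in the issue is an assumption.

```cpp
#include <cassert>

// Expand to `#pragma unroll` on Clang-based compilers; expand to nothing
// elsewhere so the sketch still compiles without unknown-pragma warnings.
#if defined(__clang__)
#define UNROLL_LOOP _Pragma("unroll")
#else
#define UNROLL_LOOP
#endif

// N is a template parameter, so the loop bound is known at compile time.
template <int N>
int sum_fixed(const int* data) {
    static_assert(N > 0, "N must be positive");
    assert(data != nullptr);
    int acc = 0;
    UNROLL_LOOP
    for (int i = 0; i < N; ++i) {
        acc += data[i];
    }
    return acc;
}
```

A caller with a bound known only at run time would need to select among a fixed set of instantiations, for example `sum_fixed<4>(data)` or `sum_fixed<8>(data)`, which trades code size for an unrollable loop.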