zichguan-amd

38 comments by zichguan-amd

Hi @gcongiu, the issue should be fixed in rocprofv3 (see [this sample](https://github.com/ROCm/rocprofiler-sdk/tree/amd-mainline/samples/counter_collection)). Please let me know if you can update to rocprofv3.

Hi @lingjiew93, thanks for reaching out. You can use `rocprof --list-derived` to see a list of performance counters and how they are calculated, or visit the [official documentation](https://rocm.docs.amd.com/en/latest/conceptual/gpu-arch/mi300-mi200-performance-counters.html#l2-cache-access-counters). `L2CacheHit` is...

Hi @qianfengz, the fix has been merged, closing the issue.

Hi @jinz2014, can you try using `-d` as suggested in https://github.com/ROCm/hipBLASLt/issues/677?

If you have all the dependencies, you can try building manually with CMake; see the docs here: https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/installation.html#manual-build-all-supported-platforms. You can choose to install in a directory that does not...

Hi @crozhon, this feature has been implemented since ROCm 6.2.

Hi @xinji1, the system speed of light section in the documentation mentioned above (https://rocm.github.io/omniperf/performance_model.html#system-speed-of-light) contains all the metrics that we currently support. There is no direct equivalent to `Compute Throughput/...

Hi @gsitaram, the default behaviour is to clear the workload directory during setup if it already exists; see [this line](https://github.com/ROCm/omniperf/blob/fb210abcd0133586b0b96bbb99678b6ea8491ef0/src/omniperf_soc/soc_base.py#L206). If you run multiple workloads with the same name, the directory...

Hi @gsitaram, the latest ROCm does not allow comparison from the same workload directory. It won't crash, but it will tell you that `You cannot provide the same path twice`....

Hi @chrirocca, I don't have an MI300X at the moment, and I was not able to reproduce the problem on an MI100. Can you try running omniperf with the `-VVV` flag like...