RookieT0T
> Hi @RookieT0T, can you share the workload that you are trying to profile? It's normal to have 0 L2 hit rate if your workload doesn't reuse any cached data....
Is there any progress?
> @RookieT0T please try using rocprofv3 from the new [rocprofiler-sdk](https://github.com/ROCm/rocprofiler-sdk) package instead of rocprofv2. rocprofv2 was always a beta and the design of the underlying rocprofiler v2 library was problematic...
Hi, I am glad to hear that. To reiterate my problem: cache hits are never reported in Docker images of ROCm 6.3.0 and older versions, except version 4.0 or...
Hi, all. I just tried the docker [image](https://hub.docker.com/layers/rocm/dev-ubuntu-22.04/6.3.1-complete/images/sha256-b3bf4a771a7d1048cf22b31eb289b390feb959c2813be94bfd4fbda9c4206b51) with 6.3.1. Unfortunately, the result of using rocprofv3 showed that the cache hit was still 0. Have you tried my example workload...
> [@RookieT0T](https://github.com/RookieT0T) I looked into your assembly (note: please format in a code block in the future) and it is unclear why you are expecting data to be in the...
> [@RookieT0T](https://github.com/RookieT0T) If I use this code: > > #ifdef NDEBUG > # undef NDEBUG > #endif > > #include > > #include > #include > #include > #include >...
> The previous TCC_HIT you were seeing is likely instruction fetch or similar. It should only hit on the second load: > > `__global__` void kernel(int * arr) { uint64_t...
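For anyone following along, here is a toy host-side sketch of the reasoning in the quotes above: a cold cache misses on the first touch of a line and can only hit when the same line is loaded again. This is not the kernel from the thread and it does not model the real L2 (TCC). `ToyCache`, `stream_once`, `load_twice`, and the 128-byte line size are all hypothetical names and values chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Toy model only: an unbounded cache that remembers which lines were touched.
// Real L2 (TCC) behavior depends on hardware details that are not modeled here.
struct ToyCache {
    std::size_t line_bytes;                      // assumed line size (hypothetical)
    std::unordered_set<std::uint64_t> resident;  // line indices currently cached
    std::uint64_t hits = 0;
    std::uint64_t misses = 0;

    explicit ToyCache(std::size_t line) : line_bytes(line) {}

    // Record one load from byte address `addr`; returns true on a hit.
    bool load(std::uint64_t addr) {
        const std::uint64_t line = addr / line_bytes;
        if (resident.count(line) != 0) {
            ++hits;
            return true;
        }
        resident.insert(line);  // first touch: fill the line
        ++misses;
        return false;
    }
};

// Streaming pattern: every line is touched exactly once, so nothing is reused.
inline ToyCache stream_once(std::size_t n_lines, std::size_t line_bytes) {
    ToyCache cache(line_bytes);
    for (std::size_t i = 0; i < n_lines; ++i) {
        cache.load(static_cast<std::uint64_t>(i) * line_bytes);
    }
    return cache;
}

// Reuse pattern: the same address is loaded twice; only the second load can hit.
inline ToyCache load_twice(std::uint64_t addr, std::size_t line_bytes) {
    ToyCache cache(line_bytes);
    cache.load(addr);  // cold miss
    cache.load(addr);  // hit, the line is now resident
    return cache;
}
```

In this model, `stream_once` never produces a hit no matter how many lines it reads, while `load_twice` produces exactly one miss followed by one hit. That is the pattern to look for in the counters when checking whether a workload actually reuses cached data.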