gpu_performance_api
Unable to access streaming counters with Vulkan?
Hello there,
Calling GpaGetSupportedSampleTypes on my GPU (RX 7600 XT) with Vulkan only returns kGpaContextSampleTypeDiscreteCounter but not kGpaContextSampleTypeStreamingCounter, so I'm not able to access streaming counters like WaveOccupancyPct.
Is there anything special to deploy or to tweak in order to access such counters?
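For anyone reproducing this, here is a minimal sketch of how the returned sample-type mask can be decoded. It does not link against GPA: the type alias and the two constants are stand-ins for `kGpaContextSampleTypeDiscreteCounter` and `kGpaContextSampleTypeStreamingCounter`, and their numeric values are an assumption (distinct bit flags), so check them against the GPA headers you build with. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins so the sketch compiles without the GPA SDK.
// Assumption: the real enum values are distinct bit flags like these.
using SampleTypeFlags = std::uint32_t;
constexpr SampleTypeFlags kDiscreteCounter  = 0x01;  // stand-in for kGpaContextSampleTypeDiscreteCounter
constexpr SampleTypeFlags kStreamingCounter = 0x02;  // stand-in for kGpaContextSampleTypeStreamingCounter

// True if the mask reported for a context includes streaming counters.
bool SupportsStreaming(SampleTypeFlags flags) {
    return (flags & kStreamingCounter) != 0;
}

// Lists the sample types present in the mask, for logging.
std::vector<std::string> DescribeSampleTypes(SampleTypeFlags flags) {
    std::vector<std::string> names;
    if (flags & kDiscreteCounter) names.push_back("discrete");
    if (flags & kStreamingCounter) names.push_back("streaming");
    return names;
}
```

With the mask I get on Vulkan (discrete only), `SupportsStreaming` returns false and `DescribeSampleTypes` yields just "discrete".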
Edit: Looking at the code, it is indeed hardcoded to only support discrete:
https://github.com/GPUOpen-Tools/gpu_performance_api/blob/a0852bceb250ff18e76f631496872f1cd32cd5d1/source/gpu_perf_api_vk/vk_gpa_context.cc#L29
It also seems to be removed here:
https://github.com/GPUOpen-Tools/gpu_performance_api/blob/a0852bceb250ff18e76f631496872f1cd32cd5d1/source/public_counter_compiler/CounterCompiler.cs#L1150-L1161
Meanwhile, the VK extensions seem to have a few things related to SPM:
https://github.com/GPUOpen-Tools/gpu_performance_api/blob/a0852bceb250ff18e76f631496872f1cd32cd5d1/source/third_party/AmdVkExt/vk_amd_gpa_interface.h#L137-L146
https://github.com/GPUOpen-Tools/gpu_performance_api/blob/a0852bceb250ff18e76f631496872f1cd32cd5d1/source/third_party/AmdVkExt/vk_amd_gpa_interface.h#L190-L214
So, what needs to be done to support streaming? Is it possible within the current codebase by replicating for Vulkan what is done for D3D12, or does it require internal changes to some AMD drivers?
At this time, GPUPerfAPI does not support streaming counters on the Vulkan API. I would not recommend attempting to implement it, because we have not released any information about how to process the resulting buffer of data. GPUPerfAPI only supports streaming counters on DX12, for use by our Microsoft PIX plugin.
As noted in our README.md:
"GPUPerfAPI now includes additional entrypoints and enums to support querying streaming counters and SQTT data. These are only intended to support the AmdRadeonPlugin in Microsoft PIX on Windows. No support will be offered on how to use these entrypoints."
We can look into whether we can create an alternative equation for WaveOccupancyPct that works as a Discrete counter. The current equation only works for Streaming counters due to the way the underlying hardware counts those events.
Alternative Recommendation - Use our Radeon GPU Profiler to capture & analyze a frame of your Vulkan application. You'll get a very nice visual of the occupancy throughout your frame, plus a lot more insight than what GPUPerfAPI can provide!
> We can look into whether we can create an alternative equation for WaveOccupancyPct that works as a Discrete counter. The current equation only works for Streaming counters due to the way the underlying hardware counts those events.
I would definitely be interested in these counters being exposed (all the CS limiters, etc.). I hope your team can find the time and priority for this!
> Alternative Recommendation - Use our Radeon GPU Profiler to capture & analyze a frame of your Vulkan application. You'll get a very nice visual of the occupancy throughout your frame, plus a lot more insight than what GPUPerfAPI can provide!
Thanks, yeah, I tried it, but unfortunately it doesn't work, as described in this issue: https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/111 (if you can help nudge someone at AMD, I would appreciate it ☺️)
Nonetheless, using GPUPerfAPI, I was able to get all counters straight into my benchmark, and I can iterate a lot more quickly.
It is particularly interesting in CI/CD scenarios as well.
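To illustrate the CI/CD angle, here is a rough sketch of the kind of check a pipeline could run on counter values collected by a benchmark. It is not my actual benchmark code: `FindRegressions`, the map layout, and the relative tolerance are all hypothetical choices for illustration, and the counter values would come from whatever your GPA integration reports.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Returns the names of counters whose measured value deviates from the
// baseline by more than rel_tol (relative to the baseline value).
// A counter missing from `measured` is reported as well.
// Note: with a baseline of 0, any non-zero measurement is flagged.
std::vector<std::string> FindRegressions(
    const std::map<std::string, double>& baseline,
    const std::map<std::string, double>& measured,
    double rel_tol) {
    assert(rel_tol >= 0.0);
    std::vector<std::string> flagged;
    for (const auto& [name, expected] : baseline) {
        const auto it = measured.find(name);
        if (it == measured.end()) {
            flagged.push_back(name);  // counter was not collected at all
            continue;
        }
        const double deviation = std::fabs(it->second - expected);
        if (deviation > rel_tol * std::fabs(expected)) {
            flagged.push_back(name);
        }
    }
    return flagged;
}
```

A CI job could fail the build when the returned list is non-empty and print the flagged names. Because `std::map` iterates in key order, the output order is stable between runs.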
I absolutely love that you've incorporated it into your benchmark and are using that information to iterate on your development! As a side comment, if you move to Radeon RX 9000 Series, be aware that the L1 counters do not exist. The L1 on that hardware is not actually a cache, so there are no Hits or Misses.
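One way to keep a benchmark portable across hardware generations is to filter the requested counters against what the device actually exposes, rather than assuming every counter exists. The sketch below is only an illustration: `SelectAvailableCounters` and `CounterSelection` are hypothetical names, and in a real integration the `available` set would be filled from the counter list that your GPA version reports for the open context.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Result of matching a requested counter list against the device's counters.
struct CounterSelection {
    std::vector<std::string> enabled;  // requested and available
    std::vector<std::string> skipped;  // requested but not exposed on this GPU
};

// Splits `requested` into counters that can be enabled and counters that
// must be skipped, preserving the requested order in both lists.
CounterSelection SelectAvailableCounters(
    const std::vector<std::string>& requested,
    const std::unordered_set<std::string>& available) {
    CounterSelection result;
    for (const std::string& name : requested) {
        if (available.count(name) != 0) {
            result.enabled.push_back(name);
        } else {
            result.skipped.push_back(name);
        }
    }
    // Every requested counter ends up in exactly one of the two lists.
    assert(result.enabled.size() + result.skipped.size() == requested.size());
    return result;
}
```

Logging the `skipped` list makes it obvious when a counter such as an L1 hit or miss counter is simply absent on the hardware, as opposed to reading as zero.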
Noted on the usefulness of the CS Limiters counters as well. We'll see what we can do with them!
I've passed along your RGP bug report to the RGP lead. RGP 2.4 specifically made improvements for Compute-only Vulkan apps. https://github.com/GPUOpen-Tools/radeon_gpu_profiler/releases/tag/v2.4 "3. Support for profiling non-frame-based pure compute DirectX® 12 and Vulkan® applications (requires a 24.30-based driver or newer)"