
[Question] Can shared and host buffers offer the same overall performance on Intel Integrated Graphics?

Open: jjfumero opened this issue 2 years ago • 1 comment

I am interested in analyzing the overall performance of end-to-end applications when using different types of buffer allocation. I wrote this blog post for reference:

https://jjfumero.github.io/posts/2022/05/overall-performance-of-unified-shared-memory-level-zero/

I found that running an application with host buffers gives the same performance as running it with shared memory buffers. My understanding is that, with shared memory buffers, the GPU driver can migrate the buffers from the host to the device, whereas host memory is accessed from the device every time a data item is required. I tested two scenarios: a) memory-bound and b) compute-bound. I was surprised that, in the memory-bound case, the overall performance was very similar whether the buffers were allocated as host memory only or as shared memory only. Is this performance expected on Intel integrated graphics?

If you want to reproduce all numbers, the whole application is available here: https://github.com/jjfumero/codeBlogArticles/tree/master/may2022/sharedMemoryEffect
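
To make the expectation above concrete, here is a toy cost model in C++. It is only a sketch of the reasoning, not a measurement and not Level Zero code. The struct, the function names and every number are hypothetical. The "integrated" configuration assumes that the GPU and CPU share the same physical memory, so no migration is needed and both access paths cost the same; whether a given driver behaves this way is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-byte costs (arbitrary units); none of these are measured values.
struct MemoryCosts {
    double local_per_byte;    // device reads from memory local to the device
    double remote_per_byte;   // device reads host memory on every access
    double migrate_per_byte;  // one-off cost of migrating a shared buffer
};

// Host allocation: every pass over the buffer reads host memory from the device.
double host_buffer_time(const MemoryCosts& c, std::size_t bytes, int passes) {
    assert(passes >= 0);
    return c.remote_per_byte * bytes * passes;
}

// Shared allocation: the buffer is migrated once, then read locally on each pass.
double shared_buffer_time(const MemoryCosts& c, std::size_t bytes, int passes) {
    assert(passes >= 0);
    return c.migrate_per_byte * bytes + c.local_per_byte * bytes * passes;
}

// Assumption: a discrete GPU, where remote reads cost more than local reads.
const MemoryCosts kDiscrete{1.0, 4.0, 2.0};

// Assumption: an integrated GPU sharing physical memory with the CPU,
// so local and remote reads cost the same and migration is free.
const MemoryCosts kIntegrated{1.0, 1.0, 0.0};
```

Under these assumed numbers, the two allocation types differ for `kDiscrete` and cost the same for `kIntegrated`, which would match the observation in the memory-bound case. The real behaviour depends on the driver, which is the open question here.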

jjfumero, Jun 10 '22 11:06

Thanks @jjfumero. You are correct. However, the result depends on what the test does and on the underlying support. Since this is more a question of how this is implemented in the L0 driver than of how it is defined in the spec, would you mind moving this issue to the driver implementation repo for Intel GPUs (https://github.com/intel/compute-runtime)?

jandres742, Jun 10 '22 14:06