ILGPU
ILGPU copied to clipboard
ImmutableView support
I know that there is a support for ImmutableArray which content is copied to GPU const memory. However it must be a static variable what is not the best solution at least for my library. At least I was not able to set such ImmutableArray as kernel parameter - when I call LoadAutoGroupedStreamKernel it throws exception:
Not supported kernel-parameter type 'System.Collections.Immutable.ImmutableArray`1[System.Single]'
It would be much better if I could use ImmutableArray as input parameter to the kernel. Or maybe much better to have something like ImmutableView. Its content would be automatically copied to GPU const memory on kernel function call and from the kernel function it could be accessed as other View types.
@CsabaStupak thanks a lot for your feature request. I think that this feature is also related to #415 and #463. We were also thinking about a separate API for ConstantMemory
in general. I am not sure whether passing arrays via kernel parameters would be the best way of storing them in constant memory... 🤔
I don't mind if the kernel will have to explicitly transfer the variables from GPU memory to GPU constant memory - similarly as I do today when I transfer data to GPU cache. The key point here is to somehow get data from CPU or GPU memory to GPU constant memory which is AFAIK much faster (even faster than GPU cache?). I can even speedup my kernels since I can use GPU cache for something different than const values.
The static approach is not the best way since my "const" arrays are changed between kernel calls but they do not change during the kernel call. So even today I have these arrays in GPU memory. Moreover for multi-thread case - when I create two accelerators, I have problem with static ImmutableArray approach.
This could potentially be split into two parts. The first part is adding support for ImmutableArrayView
, which would be useful by itself in ensuring that the code does not modify the contents of an ArrayView. Then, the second part, adding support for constant memory. We would need to decide if this is automatically handled by ILGPU, or explicitly annotated by the caller.
My vote is for automatic handling. Today if I call a kernel method I provide its arguments - view and even other non-view/const values. I guess ILGPU automatically move these values to GPU const memory or to GPU registers. I suppose that I can't move the GPU memory content to the GPU const memory inside the kernel since it should be const during the kernel call. However I do not see the whole picture here :-).
I also hope that I will be able to use SubView
and As2DView
/As3DView
methods on this ImmutableArrayView
so I do not need to think about 2D/3D arrach indexing logic :-P
@CsabaStupak We have internally discussed adding support for ImmutableView
types in several sessions. We believe that adding a separate type (namely the ImmutableView
type) will give our internal optimizer additional knowledge about read-only sections in memory and thus (potentially) leading to improved runtime performance. However, we currently have plans to add this feature in a different way, by providing an explicit AsImmutable
function for views:
var immutableSubView = data.SubView(...).AsImmutable();
// Avoid potential read-write dependency issues from here on
int myReadOnlyData = immutableSubView[...];
...
In addition, we plan to add support for automatically moving array data (accessed read-only within the kernel) to sections in constant memory to improve performance.
We are closing this issue for now and will provide PRs when ready. Thanks again for your feature request 👍.
Hi guys, I love the progress on the library. I wonder if there is any update on this? Many thanks! :-)
@m4rs-mt what did you want to do with this feature request? what should we try to implement?
It would be great to have a way to specify that my Array should go to GPU cont memory. AFAIK the kernel function non-array arguments (mostly primitives int, float, etc.) and ImmutableArray go to cont memory. This have a big limitation when I have an array which is changed by one kernel and in another kernel it is not changed at all (read-only).
Maybe the solution is to extend the kernel configuration where I specify the kernel dimension and allocate the shared memory. Here we could allocate the general read/write shared memory and also could allocate the read-only (const) shared memory which would utilize the GPU const memory.
In the kernel function then I could access the allocated read/write shared memory and transfer the data from array/view to this memory as I do it today. But I could also access the read-only/const shared memory and transfer array/view content to this memory.
So it looks like that we do not need the ImmutableArray at all. Only exception is when such array is automatically copied to GPU const memory before the kernel function is called. I guess this is what happens today. However the above kernel config approach could give more freedom. I could use any array/view to transfer its content to GPU const memory in the kernel.
@CsabaStupak Have you tried using a struct
instead of ImmutableArray
?
If you used a struct
with a fixed-size buffer as the kernel parameter, it would be copied over into constant memory on the GPU.