René Widera
IMO, as @fwyzard said, we should make sure that 32-bit and 64-bit methods are available, because others are not supported on HIP/CUDA. Increment and decrement with range is nice to...
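For readers unfamiliar with "increment and decrement with range", here is a minimal host-side sketch. It assumes the wrap-around semantics documented for CUDA's `atomicInc` and `atomicDec`, and emulates them with a `std::atomic` compare-exchange loop for 32-bit and 64-bit unsigned types. The names `atomicIncWrap` and `atomicDecWrap` are hypothetical and are not part of alpaka.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

// Hypothetical helper, not an alpaka API.
// Stores (old >= limit ? 0 : old + 1) and returns the old value,
// mirroring the documented behaviour of CUDA's atomicInc.
template <typename T>
T atomicIncWrap(std::atomic<T>& value, T limit)
{
    static_assert(
        std::is_same<T, std::uint32_t>::value || std::is_same<T, std::uint64_t>::value,
        "only 32-bit and 64-bit unsigned types are handled in this sketch");
    T old = value.load();
    T desired;
    do
    {
        desired = (old >= limit) ? T{0} : static_cast<T>(old + 1);
        // On failure, compare_exchange_weak reloads `old` and the loop retries.
    } while (!value.compare_exchange_weak(old, desired));
    return old;
}

// Hypothetical helper, not an alpaka API.
// Stores ((old == 0 || old > limit) ? limit : old - 1) and returns the old value,
// mirroring the documented behaviour of CUDA's atomicDec.
template <typename T>
T atomicDecWrap(std::atomic<T>& value, T limit)
{
    static_assert(
        std::is_same<T, std::uint32_t>::value || std::is_same<T, std::uint64_t>::value,
        "only 32-bit and 64-bit unsigned types are handled in this sketch");
    T old = value.load();
    T desired;
    do
    {
        desired = (old == 0 || old > limit) ? limit : static_cast<T>(old - 1);
    } while (!value.compare_exchange_weak(old, desired));
    return old;
}
```

With a limit of 2, repeated calls to `atomicIncWrap` cycle the counter through 0, 1, 2 and back to 0, which is why the ranged variants are convenient for ring-buffer style indexing.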
Currently, alpaka is stateless apart from a few `static` objects. If I read this issue and remember the discussions about the static objects in alpaka, the question is, IMO: does alpaka...
Thanks for opening the issue. After a first read, I am not sure what the best solution to handle the issue will be. We will discuss it in our developer...
In all other projects, we use option 3 with the range.
I removed the explicit peer copies in the past (https://github.com/alpaka-group/alpaka/pull/1400) because `cudaMemcpy*` does it automatically. There was no need to fiddle around with the peer copies anymore. Looks like...
> Maybe we could consider a different approach, that supports both compile-time and run-time dimensions -- similar to how Eigen supports compile-time and run-time sized matrices? I like the...
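As a rough illustration of the Eigen-like idea in the quote: Eigen marks run-time sizes with the sentinel constant `Eigen::Dynamic`, and a similar pattern could look like the sketch below. Everything here (`dynamicExtent`, `Extent`, `isStatic`) is a hypothetical name made up for the example and is not an existing alpaka or Eigen type.

```cpp
#include <cstddef>

// Hypothetical sentinel, analogous in spirit to Eigen::Dynamic.
constexpr std::ptrdiff_t dynamicExtent = -1;

// Compile-time case: the size is part of the type and needs no storage.
template <std::ptrdiff_t N>
struct Extent
{
    static constexpr bool isStatic = true;

    constexpr Extent() = default;

    constexpr std::ptrdiff_t value() const
    {
        return N;
    }
};

// Run-time case: the specialization for the sentinel stores the size.
template <>
struct Extent<dynamicExtent>
{
    static constexpr bool isStatic = false;

    explicit constexpr Extent(std::ptrdiff_t n) : n_(n)
    {
    }

    constexpr std::ptrdiff_t value() const
    {
        return n_;
    }

private:
    std::ptrdiff_t n_;
};
```

Code that only calls `value()` works with both variants, while code that needs a constant expression can branch on `isStatic`.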
After the discussion in https://github.com/alpaka-group/alpaka/pull/1820#discussion_r1086382178, I think this method can be useful, but it is very dangerous for the user. There are two cases - the buffer is visible on the...
I think the function `canZeroCopy` should stay, because it tells you whether you need to copy data or whether the data is directly visible, but I suggest instead of using...
@ivandrodri Sorry for the late response. Did you solve this issue already? I never tried peer memory copies, but alpaka should do the job transparently for you. A simple `cuplaMemcpyAsync`...
@fwyzard What is the test case you used for the performance measurement?