alpaka
Abstraction Library for Parallel Kernel Acceleration :llama:
While porting the CMS pixel reconstruction from native CUDA to Alpaka, it was noticed that the use of the `alpaka::getWarpSizes(device)` function incurs a noticeable overhead. See https://github.com/cms-sw/cmssw/pull/43064#issuecomment-1817590926 for the discussion....
#1958 disabled our Windows+CUDA CI because of a bug in the Windows `nvcc`. Once this is fixed, we should re-enable the CUDA-on-Windows jobs.
The CPU atomic implementation using `std::atomic_ref` uses sequentially consistent memory ordering, which is a stronger guarantee than that of its CUDA counterparts, which are weakly ordered and always require explicit fences....
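To make the ordering difference concrete, here is a minimal sketch in standard C++ of the weakly ordered style: relaxed atomic operations paired with explicit fences. It is not alpaka's implementation. It uses `std::atomic` rather than `std::atomic_ref` so that it also compiles before C++20, and the helper names `countWithOrder` and `publishWithFences` are hypothetical. `std::atomic_ref` accepts the same `std::memory_order` arguments.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: several threads increment one counter using the given
// memory ordering. Atomicity of fetch_add does not depend on the ordering, so
// the final count is the same for relaxed and sequentially consistent updates.
inline int countWithOrder(std::memory_order order, int numThreads, int itersPerThread)
{
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for(int t = 0; t < numThreads; ++t)
    {
        threads.emplace_back(
            [&counter, order, itersPerThread]
            {
                for(int i = 0; i < itersPerThread; ++i)
                    counter.fetch_add(1, order);
            });
    }
    for(auto& th : threads)
        th.join();
    return counter.load(std::memory_order_relaxed);
}

// Hypothetical helper: message passing with relaxed atomics plus explicit
// fences, similar in spirit to weakly ordered device atomics.
inline int publishWithFences()
{
    int payload = 0; // plain, non-atomic data
    std::atomic<bool> ready{false};
    int observed = -1;

    std::thread producer(
        [&]
        {
            payload = 42;
            // The release fence orders the payload write before the flag store.
            std::atomic_thread_fence(std::memory_order_release);
            ready.store(true, std::memory_order_relaxed);
        });
    std::thread consumer(
        [&]
        {
            while(!ready.load(std::memory_order_relaxed))
                std::this_thread::yield();
            // The acquire fence orders the flag load before the payload read.
            std::atomic_thread_fence(std::memory_order_acquire);
            observed = payload;
        });
    producer.join();
    consumer.join();
    return observed;
}
```

Calling `countWithOrder(std::memory_order_seq_cst, 4, 1000)` and `countWithOrder(std::memory_order_relaxed, 4, 1000)` gives the same total. The ordering only changes which surrounding memory operations are guaranteed to be visible, and that is the part the fences in `publishWithFences` restore explicitly.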
That is a tough topic. First, a quick look at the usage of device global variables in SYCL (since [oneAPI 2023.2](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_global.asciidoc)): ```cpp // declaration of the variable sycl::ext::oneapi::experimental::device_global myGlobVar; int...
Currently, all Debug CI jobs except one analysis job run with `alpaka_DEBUG=0`. This means that the extra debugging code is never tested by the CI. We should add at least a few...
Preface: This issue is not about relicensing alpaka or stripping people / institutions of their (copy)rights. Its purpose is simply to be a clarification of the current legal state. While...
I didn't expect that it would work now, but it is interesting.
_Originally posted by @AuroraPerego in https://github.com/alpaka-group/alpaka/pull/2140#discussion_r1316475428_
Alpaka works on ARM and we have an ARM CI runner. Unfortunately, we cannot use the pre-built containers for the ARM job, because they are built for the x86...
At the moment, the CI does not run `ctest` on the CPU runner if CPU backends are used on GitLab CI. This is a leftover from the beginning of...