alpaka
Abstraction Library for Parallel Kernel Acceleration :llama:
Move the template parameter `TPlatform` to the last position. There is no need to spell out the platform template argument if we pass the platform as an instance. Follow-up to #2162.
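A minimal sketch of why the parameter order matters, assuming a factory function that takes the platform as an instance. All names here (`PlatformCpuSketch`, `BufSketch`, `makeBufSketch`) are hypothetical and not alpaka's API; the point is only that a trailing template parameter can be deduced from the argument, so callers spell out just the leading ones.

```cpp
#include <type_traits>

// Hypothetical platform tag; not one of alpaka's real platform types.
struct PlatformCpuSketch
{
};

// Hypothetical buffer handle that records its element type and platform type.
template<typename TElem, typename TPlatform>
struct BufSketch
{
    using Elem = TElem;
    using Platform = TPlatform;
};

// TPlatform is the LAST template parameter, so the compiler deduces it from
// the function argument. Only TElem has to be given explicitly.
template<typename TElem, typename TPlatform>
constexpr auto makeBufSketch(TPlatform const& /*platform*/) -> BufSketch<TElem, TPlatform>
{
    return {};
}

// Usage: the platform type is never written at the call site.
constexpr PlatformCpuSketch platform{};
static_assert(
    std::is_same_v<decltype(makeBufSketch<float>(platform)), BufSketch<float, PlatformCpuSketch>>);
```

If `TPlatform` came first, the caller would have to write it out before `TElem`, even though the instance already carries the type.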
I saw that the table for the `Memory Management` section in the CUDA Runtime API is missing: https://alpaka.readthedocs.io/en/latest/dev/backends.html#cuda-runtime-api In a local build, I saw that the table has a formatting issue....
Compiling with clang and the single source file included gives me these errors: ``` In file included from :4: /app/raw.githubusercontent.com/alpaka-group/alpaka/single-header/include/alpaka/alpaka.hpp:26621:67: error: template template argument has different template parameters than its...
Missing CMake documentation for PR #2237
`GetDynSharedMem::getMem(acc)` is defined as: https://github.com/alpaka-group/alpaka/blob/9b15e664d103c581020aa5285171b67483eb5c59/include/alpaka/block/shared/dyn/BlockSharedMemDynUniformCudaHipBuiltIn.hpp#L38-L46 1. if the concern is that the memory may not be aligned enough for `T`, why not declare it as ```c++ extern __shared__ T shMem[];...
`getValidWorkDiv` has a bug. It only considers the device's _hard_ properties (`TApi::getDeviceProperties()`); it does not consider the kernel function. Kernel function properties can also limit the number of threads per block. In...
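A minimal sketch of the idea behind this report, assuming the fix amounts to taking the tighter of the two limits. The types and the function below are hypothetical stand-ins, not alpaka code, and the numbers are illustrative. In the CUDA runtime, the per-kernel limit is reported by `cudaFuncGetAttributes` as `maxThreadsPerBlock`.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the device-wide ("hard") limit.
struct DevicePropsSketch
{
    std::uint32_t maxThreadsPerBlock;
};

// Hypothetical stand-in for the per-kernel limit (for example, caused by
// register or shared-memory usage of that specific kernel).
struct KernelAttrsSketch
{
    std::uint32_t maxThreadsPerBlock;
};

// The usable block size is bounded by BOTH limits, so take the smaller one.
constexpr auto effectiveMaxThreadsPerBlock(DevicePropsSketch dev, KernelAttrsSketch kernel) -> std::uint32_t
{
    return std::min(dev.maxThreadsPerBlock, kernel.maxThreadsPerBlock);
}

// Illustrative values: the kernel limit is tighter than the device limit.
static_assert(effectiveMaxThreadsPerBlock({1024u}, {256u}) == 256u);
```

A work division derived only from the device limit (1024 in this example) would exceed what the kernel can actually launch with.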
During my work on PR #2180 I had some trouble adding the memory visibility to the correct concepts. Therefore I had an offline discussion with @psychocoderHPC and started to...
PR #2273 fixes the broken `alpaka_RELOCATABLE_DEVICE_CODE` feature. The [separableCompilationTest](https://github.com/alpaka-group/alpaka/blob/develop/test/integ/separableCompilation/CMakeLists.txt) test covers only `alpaka_add_executable`.
- `alpaka::meta::isTuple`: checks whether a given type is a `std::tuple`
- `alpaka::meta::toTuple`: packs an arbitrary number of types into a `std::tuple`. If the given type is a `std::tuple`...
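A minimal sketch of how such traits can be written, placed in a hypothetical `sketch` namespace. This is not the alpaka implementation. The behaviour of `toTuple` when it is handed a `std::tuple` is cut off in the description above, so that case is deliberately not modelled here; the alias below only covers packing a list of types.

```cpp
#include <tuple>
#include <type_traits>

namespace sketch
{
    // Primary template: an arbitrary type is not a std::tuple.
    template<typename T>
    struct IsTuple : std::false_type
    {
    };

    // Partial specialization: matches every std::tuple instantiation.
    template<typename... Ts>
    struct IsTuple<std::tuple<Ts...>> : std::true_type
    {
    };

    template<typename T>
    inline constexpr bool isTuple = IsTuple<T>::value;

    // Packs an arbitrary number of types into a std::tuple.
    // Assumption: no special handling for an input that is already a tuple.
    template<typename... Ts>
    using ToTuple = std::tuple<Ts...>;
} // namespace sketch

static_assert(sketch::isTuple<std::tuple<int, float>>);
static_assert(!sketch::isTuple<int>);
static_assert(std::is_same_v<sketch::ToTuple<int, char>, std::tuple<int, char>>);
```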