alpaka
Abstraction Library for Parallel Kernel Acceleration :llama:
While working on #1713 I discovered that the Boost.fiber back-end is broken when enabling C++20. This was fixed in their repository a few days ago but all stable versions including...
The example is not ready yet, but here is the current version. I feel for now the memory part is awkward and I have to somehow reformulate it. The goals...
AccCpuThreads is currently a bad showcase of C++11 threads, as it uses the sub-optimal strategy of spawning CPU threads at the thread level instead of the block level, just like the equally useless...
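The distinction above can be sketched as follows. This is illustrative code, not alpaka's implementation: one `std::thread` is spawned per *block*, and each thread then loops over the block's elements, keeping the thread count small instead of one OS thread per element.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch of block-level threading: each std::thread handles a
// whole block of elements, so only numBlocks threads exist at once.
void runBlocks(std::size_t numBlocks, std::size_t blockSize, std::vector<int>& out)
{
    std::vector<std::thread> threads;
    threads.reserve(numBlocks);
    for(std::size_t b = 0; b < numBlocks; ++b)
        threads.emplace_back(
            [b, blockSize, &out]
            {
                // the thread iterates all elements of its block sequentially
                for(std::size_t i = 0; i < blockSize; ++i)
                    out[b * blockSize + i] = static_cast<int>(b);
            });
    for(auto& t : threads)
        t.join();
}
```

Spawning at thread (element) level instead would create `numBlocks * blockSize` OS threads, which oversubscribes the CPU badly.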
CUDA 11.7 is released: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
Currently `AtomicStdLibLock` has a static mutex hash table which allows it to synchronize between all grids executed within a process. However, this is not documented and does not conform to...
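The hashed-mutex idea can be sketched like this (names and table size are illustrative, not alpaka's actual code): a static table of mutexes is indexed by hashing the target address, so atomics on the same address serialize. Because the table is `static`, it is shared by every grid in the process, which is the undocumented cross-grid synchronization the issue describes.

```cpp
#include <array>
#include <functional>
#include <mutex>

// Illustrative sketch of a hashed-mutex atomic add. The static mutex table is
// process-wide, so concurrent grids happen to synchronize through it too.
template<typename T>
T atomicAddLocked(T* addr, T value)
{
    static std::array<std::mutex, 256> mutexes; // shared by the whole process
    auto const idx = std::hash<T*>{}(addr) % mutexes.size();
    std::lock_guard<std::mutex> lock(mutexes[idx]);
    T const old = *addr;
    *addr += value;
    return old; // return the previous value, like CUDA's atomicAdd
}
```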
This is the follow-up to an offline discussion; it is not yet a real issue. Alpaka enforces that kernel arguments are taken either by value or by `const &`....
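The rule can be expressed as a compile-time check; this trait name is mine, not alpaka's: a parameter type is accepted when it is passed by value or by const reference, and rejected when it is a mutable reference.

```cpp
#include <type_traits>

// Hypothetical trait capturing the rule: by value or const& is fine,
// a non-const reference is not. (C++17)
template<typename T>
inline constexpr bool isValidKernelArg
    = !std::is_reference_v<T> || std::is_const_v<std::remove_reference_t<T>>;

static_assert(isValidKernelArg<int>);        // by value: ok
static_assert(isValidKernelArg<int const&>); // const&: ok
static_assert(!isValidKernelArg<int&>);      // mutable reference: rejected
```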
Executing the fibers in random order prevents memory prefetching. Iterating X first, Y second, Z third (native C memory order) would assist the prefetcher by using the expected default...
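A minimal illustration of the prefetch-friendly order, assuming the usual linearization `idx = (z * ny + y) * nx + x`: with X innermost, consecutive iterations touch consecutive addresses (unit stride), which is exactly the pattern hardware prefetchers expect.

```cpp
#include <cstddef>
#include <vector>

// Linear index for native C memory order: X varies fastest.
std::size_t linearize(std::size_t x, std::size_t y, std::size_t z,
                      std::size_t nx, std::size_t ny)
{
    return (z * ny + y) * nx + x;
}

// Traverse Z outermost, Y next, X innermost: the writes below walk memory
// contiguously, so buf[i] receives exactly the value i.
void fillSequential(std::vector<int>& buf,
                    std::size_t nx, std::size_t ny, std::size_t nz)
{
    int v = 0;
    for(std::size_t z = 0; z < nz; ++z)
        for(std::size_t y = 0; y < ny; ++y)
            for(std::size_t x = 0; x < nx; ++x) // unit-stride inner loop
                buf[linearize(x, y, z, nx, ny)] = v++;
}
```

Swapping the loop nest (Z innermost) would jump `nx * ny` elements between consecutive writes and defeat the prefetcher, which is the effect random fiber execution has.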
Enhance the fibers implementation by parallelizing the execution of the blocks.
I realized in https://github.com/alpaka-group/alpaka/pull/1707#discussion_r867812402 that when the alpaka device is destroyed, the device is not correctly freed. The class https://github.com/alpaka-group/alpaka/blob/b074b0df68a96321dc73261ab2b9d3d41180f18c/include/alpaka/dev/DevUniformCudaHipRt.hpp#L62 should call `reset()`, which calls `cudaDeviceReset()/hipDeviceReset()` and guarantees that...
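One way the proposed fix could look, as a hedged RAII sketch (identifiers are illustrative, and `deviceReset` below is a stand-in for the real `cudaDeviceReset()/hipDeviceReset()` call, stubbed so the example runs without a GPU): the device handle owns its native state through a `shared_ptr` whose deleter performs the reset, so the last owner going out of scope frees the device.

```cpp
#include <memory>

// Stand-in for the native device state.
struct NativeDevice
{
    int id;
};

// Stand-in for cudaDeviceReset()/hipDeviceReset(); here it just releases
// the host-side state so the sketch is runnable anywhere.
void deviceReset(NativeDevice* dev)
{
    delete dev;
}

// RAII device handle: copies share ownership, and the reset runs exactly
// once, when the last copy is destroyed.
class Dev
{
public:
    explicit Dev(int id) : m_impl(new NativeDevice{id}, &deviceReset)
    {
    }
    int id() const
    {
        return m_impl->id;
    }

private:
    std::shared_ptr<NativeDevice> m_impl;
};
```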
#1686 introduced the macro `ALPAKA_DEFAULT_HOST_MEMORY_ALIGNMENT`. We should document it properly and consider making it available via CMake. _Originally posted by @bernhardmgruber in https://github.com/alpaka-group/alpaka/issues/1686#issuecomment-1096531865_