Lukas Koestler
Hi @bsteinb, thank you for this nice repository! Could you maybe point me towards a resource on which algorithm to choose? I read the original papers, but of course every...
Hi, thank you very much for this cool library. I noticed that `simde_mm512_load_ps` is missing while `simde_mm512_loadu_ps` is implemented. However, both seem to exist for Intel intrinsics (see screenshot). Maybe...
Dear Maintainers, thank you for the awesome library, I really like it :) I have a strange launch failure when using `cub::BlockReduce BlockReduce` together with CUDA Dynamic Parallelism (CDP). When...
Hi @jczarnowski, thank you for this nice open-source project! Working with the code has been a pleasure so far! I just tried to build the dependencies (./makedeps.sh) with cmake 3.20.5 and...