PaulGannay

Results 7 issues of PaulGannay

### Is this a duplicate?
- [x] I confirmed there appear to be no [duplicate issues](https://github.com/NVIDIA/cccl/issues) for this bug and that I agree to the [Code of Conduct](CODE_OF_CONDUCT.md)

### Type...

bug

See [here](https://github.com/kokkos/kokkos/issues/8080) for a description of what we need. PR opened to allow for discussion on this subject: - 1 - Should we offer new callbacks (`begin_single` and `end_single`) for...

Currently, the only way of requesting that code executes on device is to use one of the parallel constructs (`parallel_for`, `parallel_reduce` or `parallel_scan`), but it can happen that one needs...

Micro-benchmarks to test:
- PerfTest_PtrAccess.cpp: the overhead of accessing a `Kokkos::View` through the parenthesis operator compared to the pointer returned by `.data()`.
- MicroBench_ParallelForOverheads.cpp: the overheads of the various objects creation...

PR to merge the work of @blegouix done in https://github.com/CExA-project/ddc/pull/708. I can't update the original PR since I don't have write access to https://github.com/blegouix/ddc/ or https://github.com/CExA-project/ddc.

The main goal of these benchmarks is to check that there is no performance regression when launching a `parallel_for`. Results with g++-13.3.0 ``` --------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------...

SNL-CI-APPROVAL
PerfTest/Benchmark