Daniel Arndt
Part of #6091.
As discussed in https://kokkosteam.slack.com/archives/C5BGU5NDQ/p1697194377868819, `parallel_reduce` with `TeamThreadMDRange`, `TeamVectorMDRange`, and `ThreadVectorMDRange` doesn't actually accumulate values across threads; rather, it behaves as a `parallel_for` with a thread-local, user-provided variable. In the end,...
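To illustrate the difference described above, here is a minimal plain C++ sketch that does not use Kokkos at all. It only models the idea: the function names, the strided work distribution, and the sequential "threads" are assumptions made for illustration, not how Kokkos implements these policies.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model: `num_threads` threads share the range [0, n) in a
// strided fashion, and each one accumulates into its own local variable.
// Without a combine step, every thread ends up with only its partial sum,
// which mirrors a parallel_for with a thread-local variable.
std::vector<long> thread_local_sums(std::size_t n, std::size_t num_threads) {
  assert(num_threads > 0);
  std::vector<long> partial(num_threads, 0);
  for (std::size_t t = 0; t < num_threads; ++t)
    for (std::size_t i = t; i < n; i += num_threads)
      partial[t] += static_cast<long>(i);
  return partial;
}

// What a reduction across threads would additionally do: combine the
// per-thread partial sums into a single value.
long combined_sum(const std::vector<long>& partial) {
  return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

In this model, a caller who reads only one thread's entry of `thread_local_sums(n, num_threads)` sees a partial result; the full sum appears only after `combined_sum` is applied.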
In the documentation for [Parallel computing with multiple processors using distributed memory](http://www.dealii.org/developer/doxygen/deal.II/group__distributed.html) we say
> [...]
> Both the PETScWrappers::MPI::Vector and TrilinosWrappers::MPI::Vector class support specifying
> this information (see step-40 and...
Based on top of #234. This pull request makes the `icpx` test with `SYCL`. I also added `--output-on-failure` to `CTest` and needed to allow `CL/sycl.hpp` instead of `sycl/sycl.hpp` for older...
@trilinos/kokkos @bartlettroscoe

## Motivation

Currently, we are essentially using three build systems in Kokkos:
- Makefile-based (only supported for using Kokkos inline, not standalone)
- raw CMake
- through Trilinos...
The functionality is similar to #542 but we perform potentially multiple queries per work item (using the previous result as a hint).
Forcing the workgroup size for the tree traversal to 1024 for `SYCL` improves tree traversal performance by 25% in the DBSCAN benchmark.
`KOKKOSTOOLS_LIBRARY_MODE` isn't a CMake variable (and is not documented), and there doesn't seem to be much of an advantage over using the implicit `BUILD_SHARED_LIBS` option. Note that `kp_add_library` expects all extra arguments to...
Tests would normally just use environment variables to load the respective tool. In any case, we shouldn't link to `kokkostools` unconditionally. Tests that set the respective hooks explicitly would not...
This might improve compiler performance. From https://github.com/intel/llvm/blob/sycl/sycl/doc/UsersManual.md:
> -f[no-]sycl-rdc Enables/disables relocatable device code. If relocatable device code is disabled, device code cannot use SYCL_EXTERNAL functions, which allows the compiler to...