Daniel Arndt

Results: 805 comments by Daniel Arndt

Is it gcc < 10.4.0 or gcc < 11.1.0? I guess you meant a different compiler for the second version number?

> No, gcc keeps patching multiple major versions at the same time. Based on the timeline of their releases I suspect they fixed the issue in gcc/12.0.0 and then back-ported...

> a runtime error occurs when calling `MatrixFree::cell_loop()`. The output messages are:

What does `cuda-gdb` say? Are you configuring `Kokkos` with `Kokkos_ENABLE_DEBUG_BOUNDS_CHECK=ON`?

Does the suggestion to use `size()` instead of `capacity()` work for you for the example given in this issue (independent of the feature requests)?
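For context, here is a minimal sketch of the distinction that suggestion relies on. It uses `std::vector` as a stand-in, which is an assumption: the container in the issue may well be a Kokkos type. The helper name `size_and_capacity_after_reserve` is hypothetical. `capacity()` reports how much storage is allocated, while `size()` reports how many elements actually exist and are valid to read.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the issue): reserve room for `reserved`
// elements, append `used` elements, and return {size, capacity}.
std::pair<std::size_t, std::size_t>
size_and_capacity_after_reserve(std::size_t reserved, std::size_t used)
{
  std::vector<int> v;
  v.reserve(reserved);                 // allocates storage, creates no elements
  for (std::size_t i = 0; i < used; ++i)
    v.push_back(static_cast<int>(i));  // only these elements are valid to read
  return {v.size(), v.capacity()};
}
```

Calling `size_and_capacity_after_reserve(16, 3)` yields a size of 3 and a capacity of at least 16, so a loop bounded by `capacity()` would run past the valid elements.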

I can't reproduce on my Mac trying various commits over the development cycle of the last release. I always see around 1.3s run time (12s with bounds checking on). The...

On Sapphire Rapids the run time is like 2.5s both with `develop` and the 4.3.01 release.

> Hang on, does Kokkos not auto-detect the native CPU architecture if no Kokkos_ARCH_... flags are passed?

No, we don't. You can use `Kokkos_ARCH_NATIVE=ON` to get `-march=native -mtune=native` (and auto-detect...

Can you run `gprof` or `VTune` for `gcc 11.4` to understand if it's really the locks?

My best bet would be that we are missing out on some compiler optimizations due to the lock and that it's not the lock itself that makes the difference. Is...

> This replacement for `std::lock_guard` does not cause the regression (performs like 4.3).

So we assume the optimization is not a memory access reordering?
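A replacement of the kind quoted above might look like the hand-written RAII guard below. This is only a sketch: the actual replacement tested in the thread is not shown in this excerpt, and the names `simple_lock_guard` and `guarded_sum` are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for std::lock_guard (not the code from the thread):
// locks the mutex on construction and unlocks it on destruction.
template <class Mutex>
class simple_lock_guard {
public:
  explicit simple_lock_guard(Mutex& m) : m_(m) { m_.lock(); }
  ~simple_lock_guard() { m_.unlock(); }
  simple_lock_guard(const simple_lock_guard&) = delete;
  simple_lock_guard& operator=(const simple_lock_guard&) = delete;

private:
  Mutex& m_;
};

// Hypothetical hot loop that takes the lock once per iteration, the kind of
// pattern where a lock can block compiler optimizations of the loop body.
long guarded_sum(const std::vector<int>& values) {
  std::mutex m;
  long sum = 0;
  for (int v : values) {
    simple_lock_guard<std::mutex> guard(m);  // released at end of iteration
    sum += v;
  }
  return sum;
}
```

Timing `guarded_sum` against the same loop without the guard, or with `std::lock_guard` in its place, is one way to check whether the lock itself or the optimizations it prevents account for the difference.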