Daniel Arndt

Results 823 comments of Daniel Arndt

We should discuss in the next meeting, though.

Note that you are not actually using the execution spaces in your example, but only their respective memory spaces. To use an execution space instance, you would do ``` auto...

> Do you mean the signatures that accept an execution space are asynchronous? This is not specified in the documentation, to my understanding of it. We are saying > If...

@pzehner Do you have any use case for calling `create_mirror_view` with a `View` with `const` value type? The two alternatives we came up with yesterday were: - disallow calling `create_mirror_view`...

> Let me check with my colleagues. The only case I can think of is if you are inside a function where a View is passed with `const` values, by...

> We will see about that. Some might not be really happy about the serial parallel scan taking a 2x slowdown. I'd be curious to hear under which circumstances...

> Otherwise why not just run OpenMP with 1 thread? The `Serial` backend doesn't require any dependencies, whereas all other backends do, so it's a good fallback and starting point.

@romintomasetti With
```C++
#include <benchmark/benchmark.h>
#include <Kokkos_Core.hpp>

void test_fence_with_kokkos(::benchmark::State& state) {
  using ExecutionSpace = Kokkos::DefaultExecutionSpace;
  ExecutionSpace exec_space;
  for (auto _ : state) {
    Kokkos::parallel_for(1, KOKKOS_LAMBDA(int i) {});
    exec_space.fence("blabla"); // will use the...
```

Without submitting any work, I'm seeing
```
---------------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
---------------------------------------------------------------------------
test_fence_with_kokkos          1622 ns         1622 ns       432608
test_fence_backend_native       1534 ns         1534 ns       457689
test_global_fence_with_kokkos    813 ns          813 ns...
```

I'm arguing that the numbers without parallel regions don't really matter. We shouldn't fence if there is nothing to fence anyway.
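
To illustrate the "nothing to fence" argument, here is a minimal sketch in standard C++ with no Kokkos dependency. `LazyFenceQueue`, `submit`, `sync_count`, and `demo` are hypothetical names invented for this illustration. This is not how Kokkos or any backend implements `fence`. It only shows the idea that a fence on an idle instance can return immediately.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an execution space instance; NOT a Kokkos type.
class LazyFenceQueue {
 public:
  // Enqueue work without running it, mimicking an asynchronous launch.
  void submit(std::function<void()> task) {
    pending_.push_back(std::move(task));
  }

  // Complete all pending work. Returns true if a real synchronization
  // happened, false if the fence was skipped because nothing was in flight.
  bool fence() {
    if (pending_.empty()) {
      return false;  // nothing to fence: skip the expensive path
    }
    for (auto& task : pending_) {
      task();
    }
    pending_.clear();
    ++sync_count_;
    return true;
  }

  // Number of fences that actually had work to wait for.
  std::size_t sync_count() const { return sync_count_; }

 private:
  std::vector<std::function<void()>> pending_;
  std::size_t sync_count_ = 0;
};

// Usage check: an idle fence is a no-op, a fence after work synchronizes once.
inline void demo() {
  LazyFenceQueue queue;
  assert(!queue.fence());  // no work submitted, so the fence is skipped

  int value = 0;
  queue.submit([&value] { value = 1; });
  assert(queue.fence());   // work was pending, so this fence synchronizes
  assert(value == 1);
  assert(queue.sync_count() == 1);
}
```

In this sketch the cost of a fence without a preceding parallel region is a single emptiness check, which is why timing that case says little about the cost of a fence that has real work to wait for.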