Christian Trott

519 comments by Christian Trott

Callbacks: good question, not sure what the best thing is. Probably the new callbacks are the right idea. We could do something where if we don't find the new callbacks...

Thanks for the report. I guess we need to figure out what the difference in the build environment is. It might also be something AMD could look at.

This example needs to be completely reworked. It's meaningless right now: NVCC will use only 26 registers for the kernel either way.

I still don't fully get why we need that force host option, in particular if it's ignored when you set a GPU arch. Is it that you are saying without...

Some more context: it's semantically incorrect for Kokkos users to access a const element type view and expect to see updates from currently running threads on an aliasing non-const element...

8.6 and 8.9 have a maximum of 1536 active threads: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications-technical-specifications-per-compute-capability 7.5 has only 1024. I guess we could fix it in there by ifdefing stuff? If we don't ifdef we...
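A minimal sketch of the kind of per-architecture dispatch the comment is talking about, assuming only the three limits quoted above. `max_threads_per_sm` is a hypothetical helper made up for illustration, not a Kokkos or CUDA API, and the integer encoding of the compute capability (major * 10 + minor) is likewise an assumption.

```c++
// Sketch only: max_threads_per_sm is a hypothetical helper, not part of Kokkos.
// The compute capability is encoded as major * 10 + minor (e.g. 8.6 -> 86).
// Only the values quoted in the comment above are listed.
constexpr int max_threads_per_sm(int compute_capability) {
  switch (compute_capability) {
    case 75: return 1024;  // 7.5
    case 86:               // 8.6
    case 89: return 1536;  // 8.9
    default: return 0;     // not covered by the comment above
  }
}
```

In device code the same selection could instead be keyed on the `__CUDA_ARCH__` macro with `#if` guards, which is closer to the "ifdefing" mentioned above; the function form is used here only so the sketch compiles on a host compiler.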

I updated the title: clearly the problem is neither `get_shmem` nor `atomic_add` as such. You start with a valid pointer, and `atomic_add` fails once you have produced an invalid one.

Adding:
```c++
if (team.team_rank() == 0) {
  Kokkos::printf("shmem=%p\n", shmem);
}
```
before the `if constexpr` stuff makes the problem go away ...

We need to report this to Intel. I also see the issue with 2024.1.