kngott

9 comments by kngott

Thank you for the concise description of the problem! It was extremely helpful. We just pushed a new pull request to development that should fix the problem. Please pull development,...

I agree with Axel. I'd use the TinyProfiler, which uses a high-precision timer, only stores the data when triggered, and does reductions in Finalize, for test codes in the AMReX...
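The pattern described in that comment (a high-precision timer, data stored only when profiling is switched on, and a summary produced once at the end) can be sketched roughly as follows. This is not the AMReX TinyProfiler API: `RegionProfiler`, `region`, and `finalize` are hypothetical names made up for illustration, and the cross-rank reduction that an MPI code would perform in its finalize step is replaced here by a local sort on a single process.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RegionProfiler:
    """Hypothetical stand-in for a TinyProfiler-style timer (not the AMReX API)."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self._totals = defaultdict(float)  # region name -> accumulated seconds
        self._calls = defaultdict(int)     # region name -> number of timed calls

    @contextmanager
    def region(self, name):
        # When profiling is off, run the body and record nothing.
        if not self.enabled:
            yield
            return
        start = time.perf_counter()  # high-resolution monotonic clock
        try:
            yield
        finally:
            self._totals[name] += time.perf_counter() - start
            self._calls[name] += 1

    def finalize(self):
        """Return (name, calls, total_seconds) rows, slowest region first.

        Assumption: a real MPI code would reduce these values across ranks
        here; this sketch only summarizes the local process.
        """
        rows = [(n, self._calls[n], self._totals[n]) for n in self._totals]
        return sorted(rows, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    profiler = RegionProfiler()
    for _ in range(3):
        with profiler.region("fill_boundary"):  # hypothetical region name
            sum(range(10_000))
    for name, calls, seconds in profiler.finalize():
        print(f"{name}: {calls} calls, {seconds:.6f} s")
```

Because the per-call work is just two clock reads and a dictionary update, the timed regions stay cheap, and all of the summarizing is deferred to the single `finalize` call.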

Which NCSA Delta partition was this? The 4 GPU A100 nodes, or a different partition?

👍 Do you also happen to know (or can you find out) how many NICs it has per node and can you confirm that's Slingshot 10?

Makes sense, thanks! So, it sounds like the strongest possibilities are either affinity differences or a better CUDA-aware MPI implementation in OpenMPI+UCX. It would be really good to lock down the...

For `use_profiler_syncs=1`: makes sense to me. Just a note for us for future testing. For MPICH: yeah, that tracks. Two systems, each with a different, unique MPI...

This is associated with the AMReX PR: https://github.com/AMReX-Codes/amrex/pull/2891 So, including that in the build would also show particle comm data.

`c=1` in the regression testing did catch my eye. :) Until a more general solution to the regression testing is worked out, I guess the best thing to do is...

I understand the need to limit the number of things you have to maintain. It's completely up to you. In my experience with scientific codes, you really only need CMake...