Castro
output the location where the timestep is set
PR summary
addresses #328
PR motivation
PR checklist
- [ ] test suite needs to be run on this PR
- [ ] this PR will change answers in the test suite to more than roundoff level
- [ ] all newly-added functions have docstrings as per the coding conventions
- [ ] the CHANGES file has been updated, if appropriate
- [ ] if appropriate, this change is described in the docs
CPU tests are good: http://groot.astro.sunysb.edu/Castro/test-suite/gfortran/2022-08-24-001/index.html
GPU tests: http://groot.astro.sunysb.edu/Castro/test-suite/gpu/2022-08-24-001/index.html
this changes some answers -- not sure why, but I saw the same phenomenon in test_react.
running on Summit, I see essentially no difference in the runtime doing a 3-d flame_wave for 100 steps
note: all the limiters are now done except for radiation -- that one is a bit more complex, since it has 2 separate boxes.
regarding the GPU diffs, I wonder if TilingIfNotGpu is the issue?
updated tests: http://groot.astro.sunysb.edu/Castro/test-suite/gfortran/2022-08-29-001/index.html
Is this test (http://groot.astro.sunysb.edu/Castro/test-suite/gfortran/2022-08-29-001/nova.html) the issue? There should not be any roundoff errors in min and max.
this is one of the failures: http://groot.astro.sunysb.edu/Castro/test-suite/gpu/2022-10-27-002/flame_wave.html
it only has diffs on GPUs
Oh. This, I guess. http://groot.astro.sunysb.edu/Castro/test-suite/gpu/2022-08-24-001/dustcollapse-restart.html
OK. I will take a look.
I think the roundoff errors come from the code that computes the number for reduction. The results agree with each other if I run in debug mode.
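For context on why GPU and CPU results can disagree at roundoff level: floating-point arithmetic is not associative, so evaluating the same expression with a different operation order (as a GPU kernel may do, e.g. via a different traversal order or FMA contraction) can change the last bits. A minimal, self-contained illustration:

```cpp
#include <utility>

// Floating-point addition is not associative: the same three terms
// combined in a different order can give different results, which is
// why order-dependent reductions differ at roundoff level.
std::pair<double, double> order_demo() {
    double a = 1.0e16, b = -1.0e16, c = 1.0;
    double left  = (a + b) + c;  // b cancels a exactly, then c survives -> 1.0
    double right = a + (b + c);  // c is absorbed into b (below 1 ulp) -> 0.0
    return {left, right};
}
```

Running in debug mode typically disables the optimizations and reorderings that expose this, which is consistent with the results agreeing there.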
thanks @WeiqunZhang !
okay. Is there a ParallelReduce that can work on ValLocPair<double, amrex::IntVect> then?
We don't. We would need to create a custom MPI_Datatype and MPI_Op for that. For a generic ValLocPair<T1,T2>, there is probably not much amrex can do. However, ValLocPair<double,amrex::IntVect> might be the most common case, so that is one we could add support for in amrex.
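For reference, the combine rule such a custom MPI_Op would encode can be sketched in plain C++. The `ValLoc` struct below is an illustrative stand-in for amrex::ValLocPair<double, amrex::IntVect>, not the actual amrex type:

```cpp
#include <array>

// Illustrative stand-in for amrex::ValLocPair<double, amrex::IntVect>:
// a value paired with the 3-d cell index where it occurs.
struct ValLoc {
    double val;
    std::array<int, 3> loc;
};

// The elementwise combine rule a custom MPI_Op for a min-with-location
// reduction would apply: keep the pair holding the smaller value.
ValLoc min_combine(const ValLoc& a, const ValLoc& b) {
    return (b.val < a.val) ? b : a;
}
```

In actual MPI code, this combine function would be registered with MPI_Op_create, alongside an MPI_Type_create_struct describing the value/index layout.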
okay, I think I need something like that, because otherwise, I am not sure how to get the location associated with the dt reduced along with dt.
actually, I guess I can do it manually: keep the reduce on dt, and then have each proc output the location if its local reduced dt is the global one.
https://github.com/AMReX-Codes/amrex/pull/3003
With the PR above, you can do

```cpp
ParallelReduce::Min(r, ParallelDescriptor::IOProcessorNumber(), ParallelDescriptor::Communicator());
ParallelAllReduce::Min(r, ParallelDescriptor::Communicator());
```
tests pass: http://groot.astro.sunysb.edu/Castro/test-suite/gfortran/2022-10-29-003/index.html
this is ready for review
AMReX module updated