My 3D simulation becomes much slower with deal.II 9.4.2 than with deal.II 9.3.3
I'd like to report a possible issue that I have run into. I restarted a 3D model after updating deal.II from version 9.3.3 to 9.4.2. The simulation is now much slower than it was with the old version. It seems similar to issue #4959 reported by @lhy11009.
Any suggestions on why? Thanks!
[Screenshot: new]
[Screenshot: old1]
[Screenshot: old2]
Hi Sibiao, Thanks for letting us know. Can you clarify:
- You say "restarted". Do you mean you restarted a checkpoint that was written with deal.II 9.3.3 with deal.II 9.4.2? While it may work (I am slightly surprised if it does) this is not a scenario we support. Checkpoints should be restarted with exactly the ASPECT and deal.II version that wrote them. If you did restart with a different version: Can you check if the speed difference also exists without restarting a model (e.g. when you start from the beginning and only run a few timesteps)?
- Are the two deal.II versions running on the same cluster with the same modules loaded?
Also in order to diagnose this properly, can you attach the following files (you can attach files by dragging them into the text box of the issue):
- a full log.txt file run with 9.3.3 and one with 9.4.2 (We need to see the timing information as well as solver iteration numbers etc)
- A parameter file
- Inside your deal.II directory there is a file called detailed.log that contains the compile-time options. Attach that as well.
Can you check if the same happens with a very simple model? E.g. the shell_simple_3d cookbook? (Take care to change the resolution of the cookbook, or the number of CPUs you use to be in a reasonable range of DoFs/core, similar to your model). If it happens with a simple model that would greatly help us reproduce the issue.
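For the DoFs/core adjustment mentioned above, the arithmetic can be sketched as follows. This is only an illustration: the function names are hypothetical, and the numbers in the usage comment are placeholders, not recommendations from this thread.

```python
import math


def dofs_per_core(total_dofs: int, n_cores: int) -> float:
    """Average number of degrees of freedom handled by each MPI process."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return total_dofs / n_cores


def cores_for_target(total_dofs: int, target_dofs_per_core: int) -> int:
    """Smallest core count that keeps the load at or below the target."""
    if target_dofs_per_core <= 0:
        raise ValueError("target_dofs_per_core must be positive")
    return max(1, math.ceil(total_dofs / target_dofs_per_core))


if __name__ == "__main__":
    # Placeholder values: take the real DoF count from the
    # "Number of degrees of freedom" line in your own log.txt.
    print(dofs_per_core(1_000_000, 20))
    print(cores_for_target(1_000_000, 50_000))
```

The idea is to pick the cookbook resolution or the core count so that both the simple model and the full 3D model end up with a comparable value.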
Hi Rene,
Thanks for your quick reply. The following answers your questions:
- Sorry, I used the wrong word. I meant to "re-run" the model from the beginning.
- Yes, they are on the same cluster (HLRN) with the same modules loaded (cmake, GCC, openmpi...).
For simple models (e.g., 2D models or some 3D models that I have run), this is not a problem. I wonder if there is something wrong with the setup of this 3D model? But if so, why does this model work on the old version with dealii 9.3.3? I will try to run the shell_simple_3d model later.
Hi Sibiao, thanks for forwarding this to me. Magali just recompiled 4.3.2 last weekend on peloton. I'll cc her to let her know. I'll test my model run times and report back what I find. Also, the issue you mentioned was tested with different versions of deal.II, but the slowdown there came from different vectorization-level options. When I compiled deal.II 9.4.0, I didn't include NATIVE_OPTIMIZATIONS. After @gassmoeller pointed that out, I recompiled and the issue was solved. So if that's related to your problem, let me know. And sorry for not closing the issue.
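One way to check whether two deal.II builds differ in their compile-time options is to diff the detailed.log files from the two installations. The sketch below uses only Python's standard difflib; the function name and the command-line usage are hypothetical, and it makes no assumption about what the files actually contain.

```python
import difflib
import sys


def differing_lines(old_lines, new_lines):
    """Return only the removed ('-') and added ('+') lines between two logs."""
    diff = list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm="", n=0
        )
    )
    # The first two entries are the '---'/'+++' file headers (absent if the
    # inputs are identical); hunk markers starting with '@@' are skipped below.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


def read_lines(path):
    """Read a text file and strip trailing newlines from each line."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python diff_logs.py old/detailed.log new/detailed.log
    for line in differing_lines(read_lines(sys.argv[1]), read_lines(sys.argv[2])):
        print(line)
```

Any line that shows up only on one side (for example, a different compiler flag or vectorization setting) is a candidate explanation for a speed difference between the two builds.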
Both of these log files seem to be truncated and do not contain complete timesteps or any timing information. Do the runs hang/crash or did you forget to post the whole file?
Hi Timo, I only ran the model for 6-7 minutes and then killed the job, because the model started working very quickly in the old version. The output file is exactly the snapshot I posted at the beginning.
Hey Rene and Haoyuan,
I wanted to update you on the issue we've been discussing. I ran some models in the cookbook, including shell_simple_3d, and they don't have this problem. This makes me think that there might be some errors or irregular settings in the 3D model that are causing the issue, even though it can be run in the same ASPECT version with dealii 9.3.3.
I'm really sorry that I won't be able to investigate further at the moment, as I'm quite busy preparing for my upcoming EGU talk next week and I'm planning to take a vacation afterwards. @gassmoeller, would it be alright with you if I closed this issue for now? Thanks!
Please leave it open. We do want to sort this problem out!
@sibiaoliu Any news? What would be most useful is a log file from each deal.II version that shows the timing output ASPECT provides every few time steps (as well as at the end of the run).
deal.II 9.5 was released a few days ago. It would of course also be interesting to see whether that made a difference.