aspect icon indicating copy to clipboard operation
aspect copied to clipboard

Fixup newton solver and elastic rheology

Open YiminJin opened this issue 1 year ago • 25 comments

This PR is a simple solution for issue #5555. The purpose is to fixup some problems related with the Newton solver.

The major modifications are listed as follows:

  • Modify the SPD factor. This modification is based on the fact that in ALL the material models in ASPECT, the constitutive relation is isotropic and the final viscosity can be expressed as a function of the second invariant of $\dot{\boldsymbol{\varepsilon}}$. Define $E \equiv \dfrac{\partial\eta}{\partial\dot{\boldsymbol{\varepsilon}}}:\dot{\boldsymbol{\varepsilon}}$. When $E < -c_{\text{safety}}\eta$, the new SPD factor is expressed as $$\alpha =c_{\text{safety}}\left|\frac{\eta}{E}\right|.$$ In fact, the expression of $E$ can be further simplified to $$E = \frac{\partial\eta}{\partial\dot{\varepsilon}}\dot{\varepsilon},$$ where $\dot{\varepsilon}$ is the second invariant of $\dot{\boldsymbol{\varepsilon}}$. Using this expression, we don't have to calculate the derivative of $\eta$ with respect to each component of the strain rate tensor. Here I preserve the original expression in case that new material models with viscosity unable to be expressed as function of $\dot{\varepsilon}$ are introduced to ASPECT in the future.

  • Add viscoelastic_strain_rate in ElasticOutputs and modify some member functions in class rheology::Elasticity. This is because we need $\dot{\boldsymbol{\varepsilon}}^{\text{ve}}$ when the Newton method is applied.

  • apply deviator() to the strain rate in system jacobian and include the contribution of elastic shear when elasticity is enabled. The reason has been illustrated in issue #5555.

I have tested the modified codes with the gerya_2019 benchmark. The prm files for VP and VEP models with Newton method are gerya_2019_vp.txt and gerya_2019_vep.txt respectively. Although the results look correct, it is still possible that there are mistakes in my modifications. Please let me know if anyone finds problems.

Before your first pull request:

  • [X] I have read the guidelines in our CONTRIBUTING.md document.

For all pull requests:

  • [X] I have followed the instructions for indenting my code.

For new features/models or changes of existing features:

  • [X] I have tested my new feature locally to ensure it is correct.
  • [ ] I have created a testcase for the new feature/benchmark in the tests/ directory.
  • [ ] I have added a changelog entry in the doc/modules/changes directory that will inform other users of my change.

YiminJin avatar Feb 17 '24 14:02 YiminJin

@YiminJin - Thanks for the contribution and I'm excited to see how this improves the convergence behavior in this class of models. Can you briefly summarize here (or post an image) of how the convergence behavior varied when these changes were applied?

Question - would it be possible to try applying these updates on top of the revised VEP formulation effort (#4370) @anne-glerum has been leading? Perhaps easiest to discuss live at an ASPECT user (or other) meeting?

We are still struggling with periodic poor nonlinear convergence behavior in some of the most complicated setups when using #4370, and perhaps your proposed changes will fix this.

Thank you again for the contribution and looking forward to discussing further!

naliboff avatar Feb 17 '24 19:02 naliboff

@naliboff Thanks for your comment. I have plotted a diagram to compare the convergence rates of DCP method and Newton method (with old and new SPD factor) in the first 10 time steps of the gerya_2019 benchmark: relative_residual.pdf In all the experiments, the maximum number of nonlinear iterations in each time step is set to 100. As seen, when the shear bands start to develop (at time step 4), the convergence rate of Newton method with the new SPD factor is the fastest, while the DCP method is the slowest. There is one exception -- time step 8, when the residual seems to diverge, because the shear bands start to bifurcate at this time step. I didn't make comparison between this branch and the main branch, because I have made some modifications that change the result. There are some lines of codes in the main branch that are problematic in my view, such as:

  • The strain rate in the system Jacobian, which has been mentioned.
  • The elastic force. In the original code, the elastic force is calculated by $$\boldsymbol{F}^e = -\frac{\eta^{ve}}{G\Delta t}\boldsymbol{\tau}^{\text{old}},$$ which is the same as Eq. [30] in the original paper of Moresi et al. (2003). However, the above expression only works for viscoelastic rheology; when plasticiticy is taken into account, the expression should become $$\boldsymbol{F}^e = -\frac{\eta^{vep}}{G\Delta t}\boldsymbol{\tau}^{\text{old}}.$$

The modifications above both change the linear system (the elastic force affects the first nonlinear iteration step for defect-correction methods), hence there are significant differences between the results obtained by this branch and the main branch . Nevertheless, the modifications are good for convergence -- the relative residual of DCP method (in 100 nonlinear iterations) is decreased by one magnitude comparing to the main branch.

Question: I have eyeballed #4370, but there are so many modifications that I cannot get the key points. Could you please tell me what the modifications are mainly for? And what is the "correct stresses" in the title?

YiminJin avatar Feb 18 '24 15:02 YiminJin

@YiminJin - Thanks for providing the plots of convergence behavior and this is really promising!

In short, the stresses in the current VE implementation diverge very slightly from simple analytical solutions, and the problem becomes more pronounced for VEP problems with strong strain localization. #4370 is a fix for these issues, which ended up being a major overhaul. I think it would be best to meet with the group working on the VEP revision live, and then we can decide how best to proceed with this PR.

I had a quick look at this PR, and early next week will try to see how easy it would be to implement your changes into #4370 (or whether they should be merged first).

Cheers, John

naliboff avatar Feb 21 '24 18:02 naliboff

FYI @MFraters @bangerth

tjhei avatar Feb 22 '24 16:02 tjhei

Thanks for your work on this and making this pull request @YiminJin! I looked over the changes, and they look reasonable to me, but I am not very familiar with the elasticity code and I will need a bit more time to think about the spd factor.

Scanning over the tester result your fixes do seem to generally improve the convergence, which is great. Can you indent your code and add to commit which updates the tester results in the pull request?

MFraters avatar Feb 28 '24 15:02 MFraters

@MFraters Thanks for checking the modifications :-) But I don't understand the meaning of "add to commit which updates the tester results" and how to indent the code. I have tried "clang-format -i ", but it changes a lot of things (including the original codes).

YiminJin avatar Feb 29 '24 01:02 YiminJin

But I don't understand the meaning of "add to commit which updates the tester results"

when you run the tests locally, you will notice that a lot of test fail. You can also see that with the github testers, which are failing. That is correct in this case, because you changes changed the results of the tester (different ammount of iterations, different residuals, ets.) Here is how you can update the test results: https://aspect-documentation.readthedocs.io/en/latest/user/extending/testing/writing-tests.html. Once you applied the diff, you can make a commit out of that and push it.

and how to indent the code. I have tried "clang-format -i ", but it changes a lot of things (including the original codes).

aspect uses astyle 2.04 (that is a very specific version of astyle, not the latest) for indenting, not clang-format. You will need to install it (see https://github.com/geodynamics/aspect/wiki/Indenting-code-and-installing-the-correct-version-of-astyle for more info). Once you have done that you can run make indent or ninja indent to indent your code.

I hope this helps!

MFraters avatar Feb 29 '24 03:02 MFraters

@MFraters Thank you for the detailed illustration. I haven't started to deal with the testers an the formats though, because I'm trying to fix the problem with material averaging.

Newton method with material averaging is far more complex than I expected. I made a straightforward derivation and updated the codes accordingly. The derivation is in the following pdf: cell_averaging.pdf However, the new modifications do not improve the convergence. I have tested it with the kaus_2010 benchmark, and the results are

Code version Linear solver Cell averaging on viscosity Total wall time (s) Total nonlinear iterations
Before modification block AMG harmonic 375 183
After modification block AMG harmonic 430 252
After modification block AMG none 214 103

kaus_2010.txt The modification for averaging makes the convergence rate even worse... It is frustrating. Could you please check my derivations and codes to see if there is something wrong?

YiminJin avatar Mar 04 '24 16:03 YiminJin

I have fixed the Stokes newton assembler with material averaging. Now with block AMG solver and harmonic average, both the kaus_2010 benchmark and the gerya_2019 benchmark can converge within 10 nonlinear iterations in each time step. Please test the newest version of this PR @naliboff . It is suggestive to switch to Newton method earlier in order to get faster convergence, like the prm files below. kaus_2010.txt gerya_2019_vp.txt gerya_2019_vep.txt

To make the Newton solver achieve optimal convergence with material averaging, 3 modifications are required:

  1. apply deviator() to the strain rate in the system jacobian, and use viscoelastic strain rate instead of strain rate if elasticity is enabled;
  2. make the viscosity derivatives compatible with the material averaging, as illustrated in the following pdf file; cell_averaging.pdf
  3. if elasticity is enabled, then the elastic force must be calculated with the averaged viscosity.

Without any of the 3 modifications, the Newton method would show poor performance.

I have also made the 3 modifications in source/simulator/stokes_matrix_free.cc, mainly in function local_apply(). However, the Newton method converges even slower than before. This time I cannot hope to figure it out by myself, since I have little knowledge about the matrix-free method. Please check the modifications in source/simulator/stokes_matrix_free.cc and see if it really does the assembling as Eqs. (18), (20) and (22) in the pdf file @MFraters @gassmoeller @tjhei . If the problem cannot be solved, I will restore the original file and deal with the format and tester issues to make this PR ready for merge.

YiminJin avatar Mar 06 '24 11:03 YiminJin

Eureka! Problem solved!

It's lucky that I took a final glance at local_apply() and noticed that I misunderstood the functions get_value() and get_gradient(). Now the Newton method works for both AMG solver and GMG solver, and the convergence rate is much better than before.

I have pushed the new commits with the indentation problem settled by astyle-2.04. However, I have entered the link labelled "Details" but did not find the "changes-test-results.diff" file. Could you please tell me how to fetch this file? @MFraters @anne-glerum

YiminJin avatar Mar 07 '24 09:03 YiminJin

Eureka! Problem solved!

great, thank you. I will take a closer look...

I have pushed the new commits with the indentation problem settled by astyle-2.04

Identation is correct now, so don't worry.

tjhei avatar Mar 07 '24 13:03 tjhei

I have fixed the Stokes newton assembler with material averaging. Now with block AMG solver and harmonic average, both the kaus_2010 benchmark and the gerya_2019 benchmark can converge within 10 nonlinear iterations in each time step

@YiminJin - This is fantastic news!! I am going to give this a try with a version of the continental extension cookbook modified to included elasticity and other adjustments, which has not been able to converge past a certain stage if the plastic damper viscosity is sufficiently low.

However, I have entered the link labelled "Details" but did not find the "changes-test-results.diff" file. Could you please tell me how to fetch this file?

Here is you would do this for the 9.5-v3 tester:

  1. Click on the details button next to the name of the tester. This should take you here.
  2. Next, click on the word "Summary", which is in the upper left part of the screen above the word "Jobs" and below "Fixup newton solver and elastic rheology #6347" with the red x next to it. That should take you here.
  3. At the bottom of that page is a section titled "Artifacts", which contains the files with differences between the tester results and the test results in your PR.
  4. You can download those files with wget or manually downloading them, and they should be placed in the main aspect directory (which is checked out to your branch). Once you have reached that point, you can update the tests on your branch using git apply (ex: git apply changes-test-results-9.5.diff).
  5. Once you have applied the changes, make a new commit (usually labeled something like update tests) and then update the PR as normal.

naliboff avatar Mar 07 '24 22:03 naliboff

@naliboff Thanks a lot! The changes in test results have been applied now.

By the way, when using material averaging with the Newton method it is recommended to use "SPD" instead of "PD", because the material averaging makes the top-left block lose its symmetry.

YiminJin avatar Mar 08 '24 03:03 YiminJin

I don't understand why the tester still fails when changes-test-results-9.5.diff and changes-test-results-master.diff are vacant. Is it possible that some compile error occurred?

YiminJin avatar Mar 08 '24 09:03 YiminJin

Is it possible that some compile error occurred?

Some tests are crashing with an error message. You can go to "details" under dealii-9.5 tester: https://github.com/geodynamics/aspect/actions/runs/8200014733/job/22426044826 Then click "report test results". Then you should see a long scrolling list of test output. Some of the failures are:

*** "Test viscoelastic_stress_build-up_gmg failed": ***
[1/9] Generating output-viscoelastic_stress_build-up_gmg/screen-output
FAILED: output-viscoelastic_stress_build-up_gmg/screen-output 
cd /__w/aspect/aspect/build/tests && /__w/aspect/aspect/tests/cmake/timeout.sh 600s /__w/aspect/aspect/tests/cmake/run_test.sh 0 /__w/aspect/aspect/build/aspect /__w/aspect/aspect/build/tests/viscoelastic_stress_build-up_gmg.x.prm /__w/aspect/aspect/build/tests/output-viscoelastic_stress_build-up_gmg/screen-output.tmp && /__w/aspect/aspect/tests/cmake/default screen-output </__w/aspect/aspect/build/tests/output-viscoelastic_stress_build-up_gmg/screen-output.tmp >/__w/aspect/aspect/build/tests/output-viscoelastic_stress_build-up_gmg/screen-output
/__w/aspect/aspect/tests/cmake/run_test.sh: line 16: 55078 Aborted                 (core dumped) $BINARY $PRM > ${OUTPUT} 2>&1
-----------------------------------------------------------------------------
--                             This is ASPECT                              --
-- The Advanced Solver for Planetary Evolution, Convection, and Tectonics. --
-----------------------------------------------------------------------------
--     . version 2.6.0-pre
--     . using deal.II 9.5.1
--     .       with 32 bit indices and vectorization level 1 (128 bits)
--     . using Trilinos 13.2.0
--     . using p4est 2.3.2
--     . using Geodynamic World Builder 0.5.0
--     . running in DEBUG mode
--     . running with 1 MPI process
-----------------------------------------------------------------------------

Vectorization over 2 doubles = 128 bits (SSE2), VECTORIZATION_LEVEL=1
-----------------------------------------------------------------------------
-- For information on how to cite ASPECT, see:
--   https://aspect.geodynamics.org/citing.html?ver=2.6.0-pre&mf=1&sha=&src=code
-----------------------------------------------------------------------------
Number of active cells: 2,500 (on 1 levels)
Number of degrees of freedom: 56,207 (20,402+2,601+2,601+10,201+10,201+10,201)

*** Timestep 0:  t=0 years, dt=0 years
   Solving temperature system... 0 iterations.
   Solving ve_stress_xx system ... 0 iterations.
   Solving ve_stress_yy system ... 0 iterations.
   Skipping ve_stress_xy composition solve because RHS is zero.
   Solving Stokes system... 250+0 iterations.

   Postprocessing:
     Compositions min/max/mass: 1.999e+08/1.999e+08/1.999e+18 // -1.999e+08/-1.999e+08/-1.999e+18 // 0/0/0
     Temperature min/avg/max:   293 K, 293 K, 293 K
     RMS, max velocity:         0.0258 m/year, 0.0445 m/year
     Writing graphical output:  output-viscoelastic_stress_build-up_gmg/solution/solution-00000



+----------------------------------------------+------------+------------+
| Total wallclock time elapsed since start     |      10.2s |            |
|                                              |            |            |
| Section                          | no. calls |  wall time | % of total |
+----------------------------------+-----------+------------+------------+
| Assemble Stokes system rhs       |         1 |      0.26s |       2.6% |
| Assemble composition system      |         3 |     0.726s |       7.1% |
| Assemble temperature system      |         1 |     0.207s |         2% |
| Build Stokes preconditioner      |         1 |    0.0129s |      0.13% |
| Build composition preconditioner |         2 |     0.018s |      0.18% |
| Build temperature preconditioner |         1 |   0.00165s |         0% |
| Initialization                   |         1 |     0.142s |       1.4% |
| Postprocessing                   |         1 |     0.186s |       1.8% |
| Setup dof systems                |         1 |    0.0965s |      0.95% |
| Setup initial conditions         |         1 |     0.242s |       2.4% |
| Setup matrices                   |         1 |     0.143s |       1.4% |
| Solve Stokes system              |         1 |      8.02s |        79% |
| Solve composition system         |         2 |   0.00193s |         0% |
| Solve temperature system         |         1 |  0.000859s |         0% |
+----------------------------------+-----------+------------+------------+

-- Total wallclock time elapsed including restarts: 10s
*** Timestep 1:  t=1000 years, dt=1000 years

--------------------------------------------------------
An error occurred in line <861> of file </__w/aspect/aspect/source/material_model/interface.cc> in function
    void aspect::MaterialModel::MaterialAveraging::average(aspect::MaterialModel::MaterialAveraging::AveragingOperation, const typename dealii::DoFHandler<dim>::active_cell_iterator&, const dealii::Quadrature<dim>&, const dealii::Mapping<dim>&, aspect::MaterialModel::MaterialModelOutputs<dim>&) [with int dim = 2; typename dealii::DoFHandler<dim>::active_cell_iterator = dealii::TriaActiveIterator<dealii::DoFCellAccessor<2, 2, false> >]
The violated condition was: 
    quadrature_formula.size() == values_out.n_evaluation_points()
Additional information: 
    When asking for a Q1-type averaging operation, this function requires
    to know the locations of the evaluation points.

Stacktrace:
-----------
#0  /__w/aspect/aspect/build/aspect: void aspect::MaterialModel::MaterialAveraging::average<2>(aspect::MaterialModel::MaterialAveraging::AveragingOperation, dealii::DoFHandler<2, 2>::active_cell_iterator const&, dealii::Quadrature<2> const&, dealii::Mapping<2, 2> const&, aspect::MaterialModel::MaterialModelOutputs<2>&)
#1  /__w/aspect/aspect/build/aspect: aspect::MaterialModel::Rheology::Elasticity<2>::fill_reaction_outputs(aspect::MaterialModel::MaterialModelInputs<2> const&, std::vector<double, std::allocator<double> > const&, aspect::MaterialModel::MaterialModelOutputs<2>&) const
#2  /__w/aspect/aspect/build/aspect: aspect::MaterialModel::Viscoelastic<2>::evaluate(aspect::MaterialModel::MaterialModelInputs<2> const&, aspect::MaterialModel::MaterialModelOutputs<2>&) const
#3  /__w/aspect/aspect/build/aspect: void aspect::Simulator<2>::get_artificial_viscosity<double>(dealii::Vector<double>&, aspect::Simulator<2>::AdvectionField const&, bool) const
#4  /__w/aspect/aspect/build/aspect: aspect::Simulator<2>::assemble_advection_system(aspect::Simulator<2>::AdvectionField const&)
#5  /__w/aspect/aspect/build/aspect: aspect::Simulator<2>::assemble_and_solve_temperature(double const&, double*)
#6  /__w/aspect/aspect/build/aspect: aspect::Simulator<2>::solve_single_advection_single_stokes()
#7  /__w/aspect/aspect/build/aspect: aspect::Simulator<2>::solve_timestep()
#8  /__w/aspect/aspect/build/aspect: aspect::Simulator<2>::run()
#9  /__w/aspect/aspect/build/aspect: void run_simulator<2>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, bool, bool)
#10  /__w/aspect/aspect/build/aspect: main
--------------------------------------------------------

It seems that the assemble_advection_system() needs to supply evaluation points now.

tjhei avatar Mar 08 '24 14:03 tjhei

@YiminJin - I ran a modified version of the continental extension cookbook (elasticity, particles, other changes) with AMG + Newton solver with this branch and the main branch. The results are consistent with your findings above, in that the updated scheme typically produces convergence (to a tolerance of 1e-5) in ~ 10 nonlinear iterations, whereas running the same problem with the current main branch or #4370 will typically not produce convergence within the maximum allotted (200) number of nonlinear iterations.

The prm file and a plot of the number of nonlinear iterations required per time step is attached, but note that the results from the model using the main branch are still running.

There is one time step using this branch where it reaches the maximum number of nonlinear iterations, and an examination of that time step revealed the nonlinear residual cycling around a tolerance near ~ 2e-5 - 5e-5. So, one anomaly, but overall this is a dramatic improvement in convergence behavior for models that run for a much longer period of time!

continental_extension_nonlinear_convergence_behavior continental_extension_vep_particles.txt

naliboff avatar Mar 08 '24 20:03 naliboff

Some tests are crashing with an error message. You can go to "details" under dealii-9.5 tester:

@tjhei Thanks for the instruction. I have found the problem.

It seems that the assemble_advection_system() needs to supply evaluation points now.

This is because when doing material averaging in filling elastic outputs and reaction terms (source/material_model/rheology/elasticity.cc), I have assumed the quadrature formula for evaluating material properties is Introspection::quadratures, but it is not always the case (for instance, when computing the artificial viscosity the quadrature becomes QTrapezoid). I have not found an easy way to solve it, so I just excluded the "project to Q1 (only viscosity)" scheme from consideration. I have added a "TODO" to mark the problem.

YiminJin avatar Mar 09 '24 13:03 YiminJin

So, one anomaly, but overall this is a dramatic improvement in convergence behavior for models that run for a much longer period of time!

@naliboff Great to hear that :-) Thanks for carrying out the experiments and plotting the comparison chart.

About the one time step that reaches the upper limit of nonlinear iterations, I think the problem lies in the constitutive model. Technically, V(E)P problems with friction angle greater than dilatancy angle are all ill-posed, because the plastic flow is non-associated. My modifications only improve the convergence rate, but do nothing to the well-posedness of the problem. For that, we still need the regularization methods. If a viscous damper with constant viscosity makes the shear bands too blurred, maybe a power-law viscoplastic regularization model (10.1029/2021GC009675) helps.

YiminJin avatar Mar 09 '24 14:03 YiminJin

If someone would like to modify this PR, please notice that the part related with matrix-free method in my derivation is wrong. Just ignore it.

YiminJin avatar Mar 13 '24 04:03 YiminJin

/rebuild

gassmoeller avatar Mar 15 '24 17:03 gassmoeller

Could someone summarize the current state of this PR? I see that Timo approved it at an earlier point, but it looks like there were a few changes afterwards. Tests are passing, so is this PR in principle ready for a final look and/or merge?

gassmoeller avatar Mar 15 '24 17:03 gassmoeller

Could someone summarize the current state of this PR? I see that Timo approved it at an earlier point, but it looks like there were a few changes afterwards.

@gassmoeller Timo approved this PR last week and pointed out that there might be some unnecessary changes in function local_apply(), so I rewrote that part, and then made some more changes to deal with the indentation and the testers.

This PR has some shortcomings:

  1. It prohibits the usage of "project to Q1 (only viscosity)" and "maximum viscosity" average when the Newton method is selected;
  2. It prohibits the usage of "project to Q1 only viscosity" average when elasticity is enabled;
  3. The assembly for system Jacobian in GMG solver may be inefficient (I am not sure).

All that aside, the purpose of this PR has been completed. It greatly improves the convergence behavior of the Newton method according to John's and my experiments. If the shortcomings are currently acceptable, then I think it is ready for merge.

YiminJin avatar Mar 16 '24 05:03 YiminJin

I would want to take another look, but I am traveling right now.

Secondly, I will have one of my students test the performance implications.

But overall, I am happy to see this merged. Thank you, again!

tjhei avatar Mar 16 '24 23:03 tjhei

I'll have another look tomorrow, hope that's okay!

anne-glerum avatar Mar 18 '24 15:03 anne-glerum

Absolutely, thanks for all the feedback. I just wanted to check in to get an overview and see what / if anything still needs to be done here.

gassmoeller avatar Mar 18 '24 18:03 gassmoeller

Looks good. Thank you.

tjhei avatar Apr 08 '24 18:04 tjhei