OrdinaryDiffEq.jl

More Runge-Kutta Methods for (Hyperbolic) PDEs

Open ranocha opened this issue 6 years ago • 29 comments

Here are some additional articles describing/investigating/developing Runge-Kutta methods, especially for hyperbolic PDEs:

These methods are not necessarily high-priority, but they may be implemented. There are also many other methods.

ranocha avatar Aug 29 '18 17:08 ranocha

Thanks! Since we have good infrastructure for Runge-Kutta methods, I think it's best to leave this as an introductory issue (this is a good "first algorithm" for potential GSoC students!)

ChrisRackauckas avatar Aug 29 '18 20:08 ChrisRackauckas

There's this one as well: https://arxiv.org/abs/1806.08693

ChrisRackauckas avatar Aug 30 '18 14:08 ChrisRackauckas

There's this one as well: https://arxiv.org/abs/1806.08693

I've added it to the list.

ranocha avatar Aug 30 '18 14:08 ranocha

And a new one: https://arxiv.org/abs/1809.04807

ChrisRackauckas avatar Sep 26 '18 21:09 ChrisRackauckas

And a new one: https://arxiv.org/abs/1809.04807

Thanks. I've added that as well to the list.

ranocha avatar Sep 27 '18 03:09 ranocha

I have tried to implement the first method in the list given above in #581. I'd like to know if it is alright.

dgan181 avatar Jan 01 '19 12:01 dgan181

Thanks, @dgan181. You implemented the method SSPRK53_2N1 of Higueras and Roldan as SSPRK53_2N, didn't you? It may be good to also add the other scheme, SSPRK53_2N2 (eq. (60) of Higueras and Roldan).

It would be good to add the new citation for SSPRK53_2N (or SSPRK53_2N1 and SSPRK53_2N2 if both are added) to https://github.com/JuliaDiffEq/juliadiffeq.github.io/blob/master/citing.md.
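For readers new to SSP schemes: the convex-combination (Shu-Osher) structure these methods share can be sketched with the classic three-stage SSPRK33 method. This is an illustrative, well-known instance only, not the Higueras and Roldan SSPRK53_2N1/SSPRK53_2N2 coefficients:

```julia
# Shu-Osher form of the classic three-stage, third-order SSP scheme
# (Shu & Osher, 1988). Each stage is a convex combination of previous
# states plus a forward-Euler step, which is what yields the
# strong-stability-preserving property. Illustrative only; these are
# NOT the SSPRK53_2N1/SSPRK53_2N2 coefficients.
function ssprk33_step(f, u, t, dt)
    u1 = u + dt * f(u, t)                             # forward-Euler stage
    u2 = 3/4 * u + 1/4 * (u1 + dt * f(u1, t + dt))    # convex combination
    return 1/3 * u + 2/3 * (u2 + dt * f(u2, t + dt/2))
end

# Drive the step over [0, 1]; u' = t^2 has exact solution u(1) = 1/3,
# which a third-order method reproduces exactly for this polynomial RHS.
function integrate(f, u, dt, nsteps)
    t = 0.0
    for _ in 1:nsteps
        u = ssprk33_step(f, u, t, dt)
        t += dt
    end
    return u
end

u_final = integrate((u, t) -> t^2, 0.0, 0.1, 10)   # ≈ 1/3
```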

ranocha avatar Jan 02 '19 08:01 ranocha

@ranocha I tried implementing the second method here: #584. Once that is resolved, I will add the new citations.

dgan181 avatar Jan 02 '19 18:01 dgan181

I'm implementing RK46-NL from Berland, Bogey, Bailly (2006)

arnav-t avatar Jan 15 '19 17:01 arnav-t

Merging the discussion of #415. @dgan181 is doing Conde, Fekete, Shadid (2018)

ChrisRackauckas avatar Jan 15 '19 21:01 ChrisRackauckas

Hello everyone, I'm a newbie trying to implement Mead, Renaut (1999)

keshavseksaria avatar Jan 16 '19 21:01 keshavseksaria

I guess I will switch to Stanescu, Habashi (1998) now, as @saurabhkgp21 did RK46-NL

arnav-t avatar Jan 17 '19 05:01 arnav-t

I am working on Parsani, Ketcheson, Deconinck (2013)

stroebel avatar Feb 01 '19 16:02 stroebel

@ranocha These papers Kubatko, Yeager, Ketcheson (2014) and Kennedy, Carpenter, Lewis (2000) have 21 and more than 16 schemes, respectively. We might not want to implement all of them, but I'm not able to decide which ones to implement and which to skip.

deeepeshthakur avatar Feb 05 '19 16:02 deeepeshthakur

For Kennedy, Carpenter, Lewis (2000), the method RK4(3)5[2R+]C (using their nomenclature) should be a good starting point. Since the low-storage schemes described there can be implemented similarly, it might be worth the effort to write some kind of "generic" version of methods such as perform_step!, so that only the low-storage Runge-Kutta coefficients have to be supplied for a new scheme.
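The suggested "generic" implementation can be sketched as a single stepping routine driven only by coefficient vectors. The coefficients below are the explicit midpoint method rewritten in Williamson's 2N form (an easy-to-check stand-in, not one of the Kennedy, Carpenter, Lewis schemes):

```julia
# One step routine for the whole family of 2N (Williamson-type)
# low-storage schemes: only the coefficient vectors A, B, c differ
# between methods, so adding a scheme means adding coefficients only.
function low_storage_2N_step(f, u, t, dt, A, B, c)
    k = zero(u)               # the single extra register ("2N" storage)
    for i in eachindex(A)
        k = A[i] * k + dt * f(u, t + c[i] * dt)
        u = u + B[i] * k
    end
    return u
end

# Explicit midpoint method written in 2N form (illustrative choice).
A = [0.0, -0.5]
B = [0.5, 1.0]
c = [0.0, 0.5]

# u' = t over [0, 1]: the midpoint rule integrates this exactly, so
# ten steps of dt = 0.1 should land on u(1) = 0.5.
function drive(f, u, dt, nsteps, A, B, c)
    t = 0.0
    for _ in 1:nsteps
        u = low_storage_2N_step(f, u, t, dt, A, B, c)
        t += dt
    end
    return u
end

u_end = drive((u, t) -> t, 0.0, 0.1, 10, A, B, c)   # ≈ 0.5
```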

ranocha avatar Feb 06 '19 05:02 ranocha

I don't have really good advice concerning Kubatko, Yeager, Ketcheson (2014). They present several schemes for different applications (the order of the DG method the schemes are optimised for). It seems that these schemes are not as popular in the DG community as others because of their higher storage requirements.

ranocha avatar Feb 06 '19 06:02 ranocha

For Kennedy, Carpenter, Lewis (2000), the method RK4(3)5[2R+]C (using their nomenclature) should be a good starting point. Since the low-storage schemes described there can be implemented similarly, it might be worth the effort to write some kind of "generic" version of methods such as perform_step!, so that only the low-storage Runge-Kutta coefficients have to be supplied for a new scheme.

@ChrisRackauckas @ranocha Yes. This can be implemented quite easily: the coefficients can be stored as an SVector in the ConstantCache instead of as separate variables. The integrator argument will remain exactly the same, and the cache argument can be changed to the union of all methods using that perform_step!. But won't this be a drastic change in how the other methods are implemented?
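That SVector idea might look like the following sketch; the cache type and constructor names here are hypothetical, not OrdinaryDiffEq.jl's actual types:

```julia
using StaticArrays

# Hypothetical cache: each 2N low-storage scheme stores only its
# coefficient SVectors, so one shared step routine serves them all.
struct LowStorageRK2NConstantCache{N, T}
    A::SVector{N, T}
    B::SVector{N, T}
    c::SVector{N, T}
end

# Methods differ only in the coefficients their cache carries
# (forward Euler and explicit midpoint in 2N form, for illustration):
euler_cache()    = LowStorageRK2NConstantCache(SVector(0.0),
                                               SVector(1.0),
                                               SVector(0.0))
midpoint_cache() = LowStorageRK2NConstantCache(SVector(0.0, -0.5),
                                               SVector(0.5, 1.0),
                                               SVector(0.0, 0.5))

# The shared step, written once for the whole family:
function shared_step(cache::LowStorageRK2NConstantCache, f, u, t, dt)
    k = zero(u)
    for i in eachindex(cache.A)
        k = cache.A[i] * k + dt * f(u, t + cache.c[i] * dt)
        u = u + cache.B[i] * k
    end
    return u
end

u_euler = shared_step(euler_cache(), (u, t) -> u, 1.0, 0.0, 0.1)   # ≈ 1.1
```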

deeepeshthakur avatar Feb 06 '19 07:02 deeepeshthakur

I don't have really good advice concerning Kubatko, Yeager, Ketcheson (2014). They present several schemes for different applications (the order of the DG method the schemes are optimised for). It seems that these schemes are not as popular in the DG community as others because of their higher storage requirements.

So which ones should we implement?

deeepeshthakur avatar Feb 06 '19 07:02 deeepeshthakur

If we follow the creators of Julia, we should implement all methods because "we are greedy". I think Chris mentioned some time ago that he likes the idea of having many published time integration schemes collected in the DiffEq ecosystem. Thus, all schemes might be implemented in order to be able to test and compare them easily.

Whether or not the "change in how the other methods are implemented" is acceptable might be debatable. Most RK methods have specialised implementations because there are optimisation possibilities specific to each scheme. However, the kinds of low-storage schemes I know of leave far fewer (or no) further optimisation possibilities. Thus, it might be worth having a general implementation for them. If we see further optimisation possibilities for a specific scheme, we can still write specialised methods.

@ChrisRackauckas should answer these questions and decide what to do, I think.

ranocha avatar Feb 06 '19 07:02 ranocha

Whether or not the "change in how the other methods are implemented" is acceptable might be debatable. Most RK methods have specialised implementations because there are optimisation possibilities specific to each scheme. However, the kinds of low-storage schemes I know of leave far fewer (or no) further optimisation possibilities. Thus, it might be worth having a general implementation for them. If we see further optimisation possibilities for a specific scheme, we can still write specialised methods.

The problem is a complicated mixture of history and GPU interactions. Essentially, when I first created the RK methods, Julia didn't have StaticArrays.jl. I tested what happened if you switched the Vectors of coefficients (defined in https://github.com/JuliaDiffEq/DiffEqDevTools.jl/blob/master/src/ode_tableaus.jl ) over to constant coefficients, and saw a pretty good speedup, so I implemented them like that. All of the first RK methods had completely different structures, so that's fine. And you don't want to condense the structure too much because you want to preserve single-kernel broadcasting. What I mean by that is:

@. tmp = dt*a12*k2 + dt*a13*k3

is much faster than

@. tmp = dt*a12*k2 
@. tmp += dt*a13*k3

when GPUs or MPI or things like that are involved, since the former is a single kernel and the latter is two, which reduces the parallelism overhead. The standard RK methods, with their random zeros and different error estimators, are difficult to put together in a way that preserves these features, so they stayed written out.
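As a quick check that fusing changes only the kernel count and not the values, the two forms can be compared on plain CPU arrays (the performance gap only shows up with GPU/MPI-backed array types):

```julia
dt, a12, a13 = 0.1, 0.2, 0.3
k2 = [1.0, 2.0, 3.0]
k3 = [4.0, 5.0, 6.0]

# Fused: one broadcast kernel, one pass over the data.
tmp_fused = @. dt * a12 * k2 + dt * a13 * k3

# Split: two kernels and two passes over tmp_split.
tmp_split = @. dt * a12 * k2
@. tmp_split += dt * a13 * k3

tmp_fused ≈ tmp_split   # true: identical values either way
```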

But not all methods have completely different structures. If you look at the Rodas4 perform step ( https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/perform_step/rosenbrock_perform_step.jl#L800-L946 ) you can see that multiple methods use that same perform step definition (https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/rosenbrock_caches.jl#L305-L558), since there are no optimizations that can be done when separating them. These cache constructions can probably be condensed even more too, since the only difference is the coefficients they use.

So yes, the low-storage methods of the same type should be condensed. In fact, the low-storage methods kind of have a stepping archetype to them that can be exploited to put a lot of them together. I wasn't worrying about that for people just starting since I'd rather people learn how to write a method and then move onto their project rather than get frustrated with writing the best most generic low-storage setup ever, but someone needs to do this. Using static array coefficients and keeping the kernels full sized should have absolutely no impact on the runtime of the low-storage methods, so there's definitely some cleanup in order.

But you can see that what you do need is enough structure contained in the method's implementation to keep the benefits. If you just go "oh, they are RK methods", that's too far, because then you don't have a way to specialize the cache sizes. So @ranocha is right: "we are greedy". In fact, here we accept that OrdinaryDiffEq.jl is a library, so while aliasing a few arrays might make analysis more difficult ( https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/low_order_rk_caches.jl#L446-L459 ), people are using our library expecting it to be performant, so we need to make these optimizations. The reason we specialize is that many times it's simply easier to get a performant method if you write each one out.

The way to think of library development in Julia is a two step process:

  1. Get it right (working, tested)
  2. Get it good (benchmarked, simplified, performant)

Trying to do (2) without doing (1) just slows you down. Many times it's good to just accept (1) and let everyone update it along the way later! In fact, doing this as two separate PRs is usually a great way to work.

ChrisRackauckas avatar Feb 06 '19 12:02 ChrisRackauckas

I wasn't worrying about that for people just starting since I'd rather people learn how to write a method and then move onto their project rather than get frustrated with writing the best most generic low-storage setup ever, but someone needs to do this. Using static array coefficients and keeping the kernels full sized should have absolutely no impact on the runtime of the low-storage methods, so there's definitely some cleanup in order.

I support that approach completely. While implementing one or two low-storage schemes is a good way to learn Julia, the DiffEq ecosystem, and more about Runge-Kutta methods, implementing lots of them can be really annoying. Thus, I've started some more "general" implementation in #648.

ranocha avatar Feb 06 '19 12:02 ranocha

Mead, Renaut (1999) is being worked on in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/pull/1052
Stanescu, Habashi (1998) is being worked on in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/pull/1051

ChrisRackauckas avatar Mar 06 '20 04:03 ChrisRackauckas

@ranocha @ChrisRackauckas I tried to implement SSPRK53_H from the first paper, with 3N registers, in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/pull/1055

Thank you!

Biswajitghosh98 avatar Mar 06 '20 08:03 Biswajitghosh98

Implemented in https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/pull/1051.

kamalojasv181 avatar Mar 07 '20 12:03 kamalojasv181

@ChrisRackauckas can I work on the Stanescu, Habashi (1998) two-step method?

kamalojasv181 avatar Mar 09 '20 17:03 kamalojasv181

Hi, I would like to try implementing one of the methods. Which of the remaining ones is the easiest, so that I can follow an existing pattern and learn the basics that way?

mtsokol avatar Mar 10 '20 23:03 mtsokol

Ketcheson's is a good one to try. Follow the devdocs: https://devdocs.juliadiffeq.org/latest/contributing/adding_algorithms/

ChrisRackauckas avatar Mar 11 '20 04:03 ChrisRackauckas

Hello, I am implementing KYK SSPRK52

ErikQQY avatar Mar 23 '21 05:03 ErikQQY