diffrax icon indicating copy to clipboard operation
diffrax copied to clipboard

SRK

Open andyElking opened this issue 2 years ago • 19 comments

This PR adds all the shiny new SRK methods, while leaving out the Langevin-only methods (ALIGN and SORT). A few things to keep in mind:

  • I wrote some comments in test/helpers.py, test_integrate.py, and _solvers/srk.py that are there for you to read and will be removed later. They are mostly questions about somewhat shaky parts of my code that I want to hear your opinions on.
  • I included langevin.ipynb in the examples folder. Because we are not merging in the LangevinTerm in _terms.py, I just copied the code for that into the notebook. If you want, I can remove this whole notebook until the next PR, but I do think it is very informative as to how the new stochastic solvers work. Furthermore, it gives you a showcase of what is to come later, which might make it clearer why I made some design decisions, as they make adding the Langevin things more seamless later on.
  • I added the new solvers to test_integrate.test_sde_strong_order. There I added a new "additive noise" mode in addition to the "commutative" and "non-commutative" noise options. Later on, once we add LangevinTerm, I also have a new set of tests called test_langevin.py prepared, which further test these methods (although they are primarily aimed and ALIGN and SORT, as those don't work with non-Langevin SDEs).
  • Apart from SRA1, I currently don't have any of Rossler's other solvers. Adding these will amount to adding just a few tableaus, which can easily be done later down the line (i.e. after you review the existing code). To be fair these aren't really my priority right now, but I will try to add them before we merge this whole PR.
  • I will probably be quite busy over the next two weeks, so I might not be able to make many edits at this point. I'm mostly making this PR now to give you ample time to read all the code 😊.

andyElking avatar Dec 25 '23 12:12 andyElking

Whoops, looks like GitHub auto-closed this due to merging the dev branch.

If you can rebase on top of main then I'd be happy to review this now. :) (Getting this in is my next priority.)

patrick-kidger avatar Jan 08 '24 22:01 patrick-kidger

No worries, I rebased it :)

andyElking avatar Jan 08 '24 22:01 andyElking

I think there is an error with the SRK notebook, I see these when I look at it on the branch or in the diff Screenshot 2024-01-25 at 2 08 32 PM Screenshot 2024-01-25 at 2 08 48 PM

lockwo avatar Jan 25 '24 21:01 lockwo

Thanks for letting me know! Seems the file got corrupted. It was perfectly fine in my local branch, just had to force push it again.

On Thu, 25 Jan 2024 at 21:09, Owen Lockwood @.***> wrote:

I think there is an error with the SRK notebook, I see these when I look at it on the branch or in the diff Screenshot.2024-01-25.at.2.08.32.PM.png (view on web) https://github.com/patrick-kidger/diffrax/assets/42878312/7c5159a6-ef43-4951-b43a-04e242a8101a Screenshot.2024-01-25.at.2.08.48.PM.png (view on web) https://github.com/patrick-kidger/diffrax/assets/42878312/81e8d9e7-e717-4a25-85f6-432e642ef737

— Reply to this email directly, view it on GitHub https://github.com/patrick-kidger/diffrax/pull/344#issuecomment-1911000919, or unsubscribe https://github.com/notifications/unsubscribe-auth/APY2OSR77AMUVDRBAZVROHDYQLCYJAVCNFSM6AAAAABBCKVBMOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJRGAYDAOJRHE . You are receiving this because you authored the thread.Message ID: @.***>

andyElking avatar Jan 25 '24 21:01 andyElking

Awesome, thanks for that! Seems like a local comment also got left on ;) image

Excited to see these methods make it to main!

lockwo avatar Jan 26 '24 05:01 lockwo

Thanks, yes, I'll take care of that 😊. There are a few more things that need to be modified anyway - I'll do that once Patrick is done with his review.

On Fri, 26 Jan 2024, 5:52 am Owen Lockwood, @.***> wrote:

Awesome, thanks for that! Seems like a local comment also got left on ;) image.png (view on web) https://github.com/patrick-kidger/diffrax/assets/42878312/1b618032-d6e1-4587-8c60-169c86e92768

Excited to see these methods make it to main!

— Reply to this email directly, view it on GitHub https://github.com/patrick-kidger/diffrax/pull/344#issuecomment-1911524244, or unsubscribe https://github.com/notifications/unsubscribe-auth/APY2OSVTFMPINYM4WCS23P3YQNACLAVCNFSM6AAAAABBCKVBMOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJRGUZDIMRUGQ . You are receiving this because you authored the thread.Message ID: @.***>

andyElking avatar Jan 26 '24 07:01 andyElking

Thanks for these comments, I made the appropriate corrections.

Overall, my impression is that this looks pretty good! It's also really highlighting how we really need a way to say something about the control in the term (see also #359), so that we can write something like

term_structure: MultiTerm[tuple[ODETerm, AbstractTerm[AbstractBrownianMotion[SpaceTimeLevyArea]]]]

and try to avoid the current heuristic Levy area checks happening in AbstractSRK.init.

Yes, I will think about that. So I don't think the type of Levy area generated is important enough to make its way into the type annotation, and could be left at the level of an attribute.

Probably the least invasive way would be to generally just determine compatibility with the control based on the PyTree signature of it output (e.g. LevyVal), like I do in srk.py:

if sttla:
     assert bm_inc.K is not None

If the control doesn't have the right shape we throw a well-explained error. In either case it is up to the user to make sure the BM has the correct shape (to match the diffusion VF), so maybe Levy area doesn't need any more special treatment than that.

I'm not really sure what is "nice enough" though. You are certainly much more of an expert on how to write good code.

andyElking avatar Jan 30 '24 13:01 andyElking

Also, regarding langevin.ipynb: The reason I import private things is that I moved the LangevinTerm from _term.py into that notebook, just to give you some context on what is coming later. I intend to remove the notebook from this PR before it gets merged, but I can remove it now already if you wish.

andyElking avatar Jan 30 '24 13:01 andyElking

(let me know when you next want my input on this PR btw)

patrick-kidger avatar Feb 10 '24 19:02 patrick-kidger

(let me know when you next want my input on this PR btw)

So I made the changes you suggested (except eqxi.scan) on the pr_correction branch. I also made some changes to test_sde_strong_order there, because they were quite sorely needed. I explained why in the code comments. But I'll probably merge all of that into this branch once I figure out the eqxi.scan, which might be on Monday or Tuesday. I'll let you know.

andyElking avatar Feb 10 '24 23:02 andyElking

Hi Patrick, I made the edits you suggested.

In addition I also reformed convergence testing. This includes comparing the values of solutions at several times (not only at t1) as well as addressing the concern about Euler and Heun being way too imprecise to be used in establishing the order of high-order solvers. I explain more in this comment: https://github.com/andyElking/diffrax_STLA/blob/c70e33bf0a18bd19ac7a3b0ed7074c6bafccd79b/test/test_integrate.py#L229-L233

andyElking avatar Feb 12 '24 16:02 andyElking

@patrick-kidger I haven't managed to address all your comments yet, I'll do the rest tomorrow. But I decided to push it as is so you can have a look at the things I did manage to fix (I hope...).

SRK example

Regarding the SRK example: I think we'd need to consider whether to include this in the docs or not! It's quite wordy, and focuses on convergence rates etc. I tend to reserve the Examples section of the docs for things that are useful to most new users. Maybe we should put it in the devdocs at the bottom?

Yes, I'm happy to put it wherever. I can leave just a short use-case in the examples and the rest into devdocs. But I do want to keep it around somewhere. If nothing else, Extropic people said they liked those notebooks as a guidance for which solvers are used for what and how. Let me know where exactly I should put it.

Tests

This aside, I'm quite antsy about big the changes are to the tests here. This is an easy way for bugs to creep in. For simply testing the order of the new solvers I feel like we should be able to use the existing infrastructure. If Euler/Heun aren't sufficient then maybe we can pick problems with analytical solutions, or just use even smaller step sizes in the reference solutions?

I'm not sure problems with analytical solutions are necessarily a general-enough SDE. I can ask James, though. Even-smaller steps are also not an option: Weaker solvers only get into their convergence regime for dt<2**-3. We want at least a factor of 32 between largest and smallest datapoints, so we go down to 2**-8 (or lower). But when we add SORT, which is 3rd order, if we want Heun (as ref_solver) to be more precise than SORT is at dt=2**-8, we need ref_dt=2**-18 which is undoable on my laptop (with an RTX 4080 mobile). This means we'd need to adapt the test in hacky ways for high order solvers, which is not sustainable.

I think my proposed approach is much better. There are two properties we are testing here:

  • if the solver converges to its own limit with the right order
  • whether the limit of the solver is the same as that of Euler/Heun.

By separating these, there is no longer a need for such a painfully finely discretised reference solution. I don't think I have a better idea for how to write these tests in a clean way.

Brownian return type inheritance

I prefer SpaceTimeLevyArea inherits from BrownianIncrement (and STTLA inherits from STLA), and we can get rid of AbstractBrownianReturn. This gives a natural hierarchical structure, eases the checks of minimal_levy_area, lets me get rid of an example of # pyright: ignore​, and makes type annotations easier altogether.

andyElking avatar Feb 20 '24 22:02 andyElking

Okay, @patrick-kidger I made the rebase and all the necessary fixes now, let me know what you think. Also see my previous comment for some explanation.

andyElking avatar Feb 21 '24 18:02 andyElking

Let's put the SRK example in the devdocs!

Tests: hmm, okay, I defer to your experience on this! In that case what I'll say is let's do these as separate tests where required. (And it sounds like this is only needed for a few of the SRK methods?) That is, let's not edit any of the existing tests (except in minor ways if it makes sense to factor out something small). That'll give me a lot greater surety that things are behaving as they should do!

Brownian return type inheritance: I'm going to disagree with you on this one :) I have a strong preference for "concrete means final", i.e. that you cannot subclass a concrete class. (Or if you prefer, every class admits precisely one of being instantiated, or being subclassed.) In the broader context of Equinox I've got a guide on this here. As it happens, in this individual case, I agree that it would be easier to read if we ignored this rule! However, the I believe the codebase as a whole is easier to read if we know that this pattern is always followed.

patrick-kidger avatar Feb 22 '24 13:02 patrick-kidger

Let's put the SRK example in the devdocs!

Will do.

Tests: hmm, okay, I defer to your experience on this! In that case what I'll say is let's do these as separate tests where required. (And it sounds like this is only needed for a few of the SRK methods?) That is, let's not edit any of the existing tests (except in minor ways if it makes sense to factor out something small). That'll give me a lot greater surety that things are behaving as they should do!

You're right, I'll separate out the new tests, it's safer that way. Still, I think the old tests are a bit flawed and not extensible (they only check the terminal value and rely on the assumption that the reference solution is a lot more precise than what any future solver will produce), so eventually you should consider swapping them for something new.

Brownian return type inheritance: I'm going to disagree with you on this one :) I have a strong preference for "concrete means final", i.e. that you cannot subclass a concrete class. (Or if you prefer, every class admits precisely one of being instantiated, or being subclassed.) In the broader context of Equinox I've got a guide on this here. As it happens, in this individual case, I agree that it would be easier to read if we ignored this rule! However, the I believe the codebase as a whole is easier to read if we know that this pattern is always followed.

Given that having this inheritance seems very natural, makes the code a lot cleaner and avoids several pyright: ignore tags, we should really figure out a way to do something equivalent. What do you suggest?

andyElking avatar Feb 22 '24 13:02 andyElking

@lockwo I fixed the issue you raised and added a test that checks whether SDE solvers work correctly with a PyTree-shaped SDE. Let me know if everything works in your use-case.

@patrick-kidger I reverted back to the old test_sde_strong_order and separated the new tests. I also moved srk_example to devdocs.

Regarding both of the "class struggles" we talked about earlier, I still don't think I can satisfy your stylistic preferences while keeping a natural structure and clean code. But I would welcome a more concrete suggestion how to do this in a way that doesn't compromise the rest of my code.

andyElking avatar Feb 22 '24 19:02 andyElking

@patrick-kidger It keeps saying the tests are getting cancelled (even though nothing failed). What could be the reason for this?

andyElking avatar Feb 22 '24 22:02 andyElking

I separated the new sde tests into test_sde.py and now they all passed. Is it perhaps that you can't have too many tests in one file? Maybe a memory issue?

andyElking avatar Feb 24 '24 15:02 andyElking

@lockwo I fixed the issue you raised and added a test that checks whether SDE solvers work correctly with a PyTree-shaped SDE. Let me know if everything works in your use-case.

@patrick-kidger I reverted back to the old test_sde_strong_order and separated the new tests. I also moved srk_example to devdocs.

Regarding both of the "class struggles" we talked about earlier, I still don't think I can satisfy your stylistic preferences while keeping a natural structure and clean code. But I would welcome a more concrete suggestion how to do this in a way that doesn't compromise the rest of my code.

I can confirm that my pytree based code now works, thank you

lockwo avatar Feb 26 '24 21:02 lockwo

@patrick-kidger you mentioned you wanted to get this done before 1st March. I did what you suggested as soon as I could (apart from the class inheritance bits, where I need more guidance on how to proceed). When do you think you can do your next review?

andyElking avatar Feb 28 '24 05:02 andyElking

Heyhey! Sorry for the radio silence, been rather hectic. Okay, so, on class struggles: this is an important point so I'll dedicate a whole, frankly rather long, post to this. This is basically me geeking out about type theory for a bit.

My proposal

First of all, my proposal:

class AbstractBrownianReturn(eqx.Module):
    dt: AbstractVar[PyTree[Array]]
    W: AbstractVar[PyTree[Array]]

class AbstractSpaceTimeLevyArea(AbstractBrownianReturn):
    H: AbstractVar[PyTree[Array]]

class AbstractSpaceTimeTimeLevyArea(AbstractSpaceTimeLevyArea):
    K: AbstractVar[PyTree[Array]]

class BrownianReturn(AbstractBrowianReturn):
    dt: PyTree[Array]
    W: PyTree[Array]

class SpaceTimeLevyArea(AbstractSpaceTimeLevyArea):
    dt: PyTree[Array]
    W: PyTree[Array]
    H: PyTree[Array]

class SpaceTimeTimeLevyArea(AbstractSpaceTimeTimeLevyArea):
    dt: PyTree[Array]
    W: PyTree[Array]
    H: PyTree[Array]
    K:PyTree[Array]

You're probably looking at that and thinking I've gone mad! That looks extra-complicated with loads of code duplication.

The rationale here is quite careful, so let me explain.

Why have both an AbstractFoo and a ConcreteFoo?

This reflects a careful distinction: abstract types are ones which can be subclassed. Concrete types are ones which can be instantiated.

This means that when you see a function of the form

def some_fn(x: ConcreteFoo):
    ...

then you know with 100% certainty exactly what type x will be at runtime. It cannot be subclassed. You do not need to worry about all of the ways its API might be implemented by a subclass, and this frees you from whole classes of bugs.

This particular point is an idea taken from Julia, and for this I very much like Stefan Karpinksi's discussion on this:

While there are a number of practical reasons to disallow subtyping of concrete types, I think they all stem from the basic fact that subtyping a concrete type is logically unsound. A type can only be instantiated if it is completely specified, and it only makes sense to subtype something if it is incompletely specified. Thus, a type should either be abstract or concrete but not both. What object-oriented languages that allow the same type to be both instantiated and subtyped are really doing is using the same name for two different things: a concrete type that can be instantiated and an abstract type that can be subtyped. Problems ensue since there's no way to express which you mean. You end up with attempts to resolve this confusion like the "final" keyword and recommendations against ever subtyping classes that weren't intended to be abstract. But wouldn't it be better not to conflate the two things in the first place?

Source + interesting further discussion: see here. Do a bit of a search on the Julia forms in general and you'll see other interesting discussions on this topic, it really is a great way to implement a type system for a language.

Why have both an AbstractVar[Bar] and a Bar?

Okay, perhaps I've convinced you that it's worth having the distinction between abstract and concrete.

You might now ask if we can simplify things to this:

class AbstractBrownianReturn(eqx.Module):
    dt: PyTree[Array]
    W: PyTree[Array]

class AbstractSpaceTimeLevyArea(AbstractBrownianReturn):
    H: PyTree[Array]

class AbstractSpaceTimeTimeLevyArea(AbstractSpaceTimeLevyArea):
    K: PyTree[Array]

class BrownianReturn(AbstractBrowianReturn):
    pass

class SpaceTimeLevyArea(AbstractSpaceTimeLevyArea):
    pass

class SpaceTimeTimeLevyArea(AbstractSpaceTimeTimeLevyArea):
    pass

However, there are two reasons against this. The first is that when I look at SpaceTimeTimeLevyArea, I have no idea what its __init__ method looks like. I have to traverse a whole type hierarchy to find the signature of a single method, in this case that it is (dt: PyTree[Array], W: PyTree[Array], H: PyTree[Array], K: PyTree[Array]). This is bad -- navigating interactions across complicated type hierarchies is big part of why OOP isn't really enjoyed any more! Let's not do that again.

The second is that when you write something like

class AbstractRectangle(eqx.Module):
    height: AbstractVar[int]
    width: AbstractVar[int]

    @abstractmethod
    def paint_colour(self, colour: str): ...

then the intention is to define a contract that the API will meet, e.g. so that this is valid:

def some_fn(rect: AbstractRectangle):
    area = rect.height * rect.width
    rect.paint_colour("purple")

You can see that I've included an abstractmethod here to try and emphasise how we can have both abstract attributes and abstract methods, and that both are things we may wish to define in an API contract.

However, when you write something like

class Square(AbstractRectangle):
    side: int

    @property
    def height(self) -> int:
        return self.side

    @property
    def width(self) -> int:
        return self.side

    def paint_colour(self, colour: str): # some implementation here

then you can see there is an important distinction between an attribute (an API contract that dot-lookup rect.height is valid) and a field (what is actually stored in memory).

These are two different things! We do not want to muddle these things up.

In practice the rule we use in Equinox is that precisely one type in a hierarchy is allowed to define fields. (Almost always the concrete type at the bottom of the hierarchy.) These will (a) define the __init__ method, and (b) define what is stored in memory. We could maybe find other ways of doing this, but this happens to be what we do today.

Discussion

You might claim that my proposal above is not very readable! And frankly, you would be right. It is much harder to read than the simpler thing you have right now.

However, I'm a big believer in consistency: the above design principles are applied uniformly across the whole Equinox ecosystem. We trade off local readability here for the ability to navigate through the rest of the codebase with greater surety. In fact, I think what we have here is more-or-less the absolute worst-case-scenario for this design principle; each class and field is duplicated twice and there's nothing else going on. And it's still... not that bad? Whilst these design principles give us a way to handle all the complexities we've tackled elsewhere.

By the way, this idea that I describe here, of turning a concrete inheritance hierarchy:

ConcreteA
 \
  ConcreteB
   \
    ConcreteC

into an abstract one:

          AbstractA
         / \
ConcreteA   AbstractB
           / \
  ConcreteB   AbstractC
             /
    ConcreteC

is a trick worth knowing, as it comes up every now and again.

Phew, that was a long post! Thank you for staying with me. As you can see, I'm a big believer in getting the types right, I think it's a really important part of the long-term readability and maintainability of a whole codebase.

patrick-kidger avatar Mar 02 '24 11:03 patrick-kidger

Wow! Thank you so much for this detailed explanation, Patrick!! I appreciate it a lot! I understand what you mean and you certainly convinced me this is the way to go. I just wasn't really sure what you wanted from me last time (and frankly I was quite tired from the skiing 😅), so I maybe came off as a bit impatient, sorry. Thanks for this very good lesson 😊!!

I will read everything again more closely later today and make the appropriate changes.

andyElking avatar Mar 02 '24 12:03 andyElking

I noticed the LD example got deleted here (https://github.com/patrick-kidger/diffrax/pull/344/commits/a6d0c3d8f5d6e1a86ebc46dcb4bef627a938cae8#diff-3b8783ddcbe486f44c31d4f05eb072aeae0e174785b5e649f75226673c65b3fc). Is there a place something like that could live on diffrax? I found it helpful, maybe there could be a whole section for "further examples" or something?

lockwo avatar Mar 04 '24 23:03 lockwo

@lockwo the Langevin stuff is coming in a later PR, and langevin.ipynb will be in devdocs then.

You can still find it on the main branch (in examples) or on the devel branch (in notebooks). Sorry the devel branch is a bit of a mess - I've been doing some experiments for an upcoming paper and had to make some hacky workarounds.

andyElking avatar Mar 04 '24 23:03 andyElking

Also @patrick-kidger when you have the time, the code is ready for the next review. I did my best regarding class inheritance, hope you like it.

The only thing I'm unsure about is in VBT where we had if self.levy_area=="space-time" now I use if issubclass(self.levy_area, AbstractSpaceTimeLevyArea), which relies on ordering the if clauses so the STLA clause comes before the BM clause (and in the future STTLA will have to be before STLA). If, however, I used if self.levy_area is AbstractSpaceTimeLevyArea, that could be confusing for users since setting levy_area=SpaceTimeLevyArea would result in an error. What do you think is a more desirable behaviour?

In addition, I dealt with the deprecation of import jax.config in conftest.py.

andyElking avatar Mar 04 '24 23:03 andyElking

Okay! The great news is that I think we're just about there. I've left some very twiddly little nits all over, but that's it. I'm basically happy with this implementation.

That's great to hear! Thank you for all your help with this, I really appreciate it, and it is a great learning experience!

By the way, do you want to add the solvers to _autocitation.py? This will mean that we can automatically generate bibtex files for these solvers too.

I fixed all your other comments and I'll try to deal with the autocitation tomorrow.

andyElking avatar Mar 21 '24 21:03 andyElking

Are any of the new solvers compatible with weakly controlled terms? I've been using them a lot for certain problems where my noise is a constant so I don't want the full matmul, but when I tried them with ShARK I see it isn't compatible.

ValueError: `terms` must be a PyTree of `AbstractTerms` (such as `ODETerm`), with structure diffrax._term.MultiTerm[tuple[diffrax._term.ODETerm, diffrax._term.ControlTerm[typing.Any, diffrax._custom_types.AbstractSpaceTimeLevyArea]]]

(if I change it to a normal Control term it works fine)

lockwo avatar Apr 02 '24 00:04 lockwo

Or is this related to Patricks PR which does "I realised that it's not sufficient to have term_compatible_contr_kwargs have a single thing on the solver -- we need something with essentially the same tree structure as term_structure, so that we can map the kwargs to each term. So I reworked this."

lockwo avatar Apr 02 '24 00:04 lockwo

Can confirm, it runs off Patricks branch

lockwo avatar Apr 02 '24 01:04 lockwo

Can confirm, it runs off Patricks branch

Hi @lockwo!

Yes indeed any descendant of AbstractSRK should work with any _AbstractControlTerm(AbstractTerm[_VF, _Control]) as long as term.control is a descendant of AbstractBrownianPath. And yes, the issue is just that we are trying to integrate SRKs with the new term structure checks from #364 and it seems I haven't done that correctly. Sorry about this.

I'll soon merge in Patrick's branch, so it won't stay broken for long.

andyElking avatar Apr 02 '24 09:04 andyElking