Linus

Results: 37 comments by Linus
Executing the MWE on SciMLSensitivity#master now yields this:
```
julia> println(Zygote.gradient(ps -> loss(ps), ps_))
ERROR: Scalar indexing is disallowed. Invocation of getindex resulted in scalar indexing of a GPU array....
```

Same error on CPU:
```
julia> println(Zygote.gradient(ps -> loss(ps), ps_))
ERROR: BoundsError: attempt to access 12×1 ComponentMatrix{Float32, Matrix{Float32}, Tuple{Axis{(ps_drift = ViewAxis(1:6, Axis(weight = ViewAxis(1:4, ShapedAxis((2, 2), NamedTuple())), bias = ViewAxis(5:6,...
```

I'm pretty sure I can solve this; it seems like `stack` on a `ComponentVector` isn't behaving as expected.

> Zygote has a bug here that's easy to workaround:
>
> ```julia
> using Lux, Zygote, DifferentialEquations, ComponentArrays, Random, SciMLSensitivity, LinearAlgebra
>
> p = [1.5, 1.0, 3.0, 1.0]...
> ```

> Zygote has a bug here

What is the bug exactly? Can we fix it? This workaround is O(n^2).

hmmm BUT we need to perturb this correctly with the noise. I can't figure out how torchsde / diffrax are doing this right now...

In `torchsde`, they never actually define `g` in the adjoint SDE; they only define `g_prod`, which is the product of `g` and the noise. So compare the implementation for EulerHeun: ```...
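To illustrate the idea of defining only the product of `g` and the noise, here is a minimal sketch for the diagonal-noise case. The names `g_diag`, `g_dense`, `g_prod`, and the scaling vector `σ` are hypothetical and chosen for illustration; this is not the torchsde or SciMLSensitivity implementation.

```julia
using LinearAlgebra

# Hypothetical diagonal diffusion: g(u) = Diagonal(σ .* u)
σ = [0.1, 0.2, 0.3]

# Only the diagonal entries of g: O(n) storage.
g_diag(u) = σ .* u

# The full n×n diffusion matrix, shown for comparison only: O(n^2) storage.
g_dense(u) = Matrix(Diagonal(g_diag(u)))

# Product of g with the noise increment, computed without forming the matrix.
g_prod(u, dW) = g_diag(u) .* dW

u  = [1.0, 2.0, 3.0]
dW = [0.5, -1.0, 2.0]

# Both routes give the same vector.
@assert g_prod(u, dW) ≈ g_dense(u) * dW
```

For non-diagonal noise the same pattern would apply with a matrix-vector product in place of the elementwise one, assuming the solver only ever needs `g` applied to a noise vector.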

update: there's a trivial solution!!
> ```
> # how to multiply tmp2 with dW such that dgrad * dW == tmp2 (*) dW?
> # how to multiply tmp1...
> ```

this of course gives us quite a performance boost:
```
m = 10000
(...)
function f(u, p, t)
    [p[1] * x - p[2] * t + p[3] * p[4] *...
```