DiffEqFlux.jl

WIP: adds LTC layer and its example

Open manyfeatures opened this issue 4 years ago • 9 comments

This PR adds a Liquid Time-Constant (LTC) layer (#509).

The original repo1 and repo2 contain code consistent with the biologically inspired NCP architecture.

[Screenshot from 2021-04-22 18-53-17]

The current implementation differs from them and instead follows the plain description in the paper.

[Screenshot from 2021-04-22 18-59-21]
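
For reference, my reading of the dynamics described in the paper (eq. 1, which the screenshot above shows) is

$$\frac{dx(t)}{dt} = -\left[\frac{1}{\tau} + f(x(t), I(t), t, \theta)\right] \odot x(t) + f(x(t), I(t), t, \theta) \odot A$$

where $f$ is the neural network, $\tau$ the time-constant vector, and $A$ the bias vector.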

To ensure stability, the output of the network is clipped to the range [-1, 1].
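
A minimal sketch of that clipping, assuming the layer output is an array `y` (the helper name `clip_output` is mine, not from the PR):

# Hypothetical helper: elementwise clipping of the layer output to [-1, 1]
clip_output(y) = clamp.(y, -1f0, 1f0)

clip_output([-3f0, 0.5f0, 2f0])  # -> Float32[-1.0, 0.5, 1.0]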

manyfeatures avatar Apr 22 '21 16:04 manyfeatures

Looks great but I won't be able to review until tomorrow

ChrisRackauckas avatar Apr 22 '21 16:04 ChrisRackauckas

A couple of points; I am still going through it and will get back with more informed feedback later.

  1. I can see that in the Python repo they are doing the evolution of the ODE with a loop. I am wondering if we can instead use `solve` for it by defining the diff eq mentioned in eq. 1 directly. My interpretation is that the LTC should look similar to NeuralODE and be something like
struct LTC{M,P,RE,T,TA,AB,A,K} <: NeuralDELayer
    model::M
    p::P
    re::RE
    tspan::T
    τ::TA
    A::AB
    args::A
    kwargs::K
end

where the user gives you the NN as the model and we internally create the ODEProblem with it as (`n` is the LTC layer below; a fuller sketch follows after point 2)

# LTC dynamics (eq. 1): dx/dt = -(1/τ + f(x; θ)) ⊙ x + f(x; θ) ⊙ A
dudt_(u, p, t) = -(1 ./ n.τ .+ n.re(p)(u)) .* u .+ n.re(p)(u) .* n.A
ff = ODEFunction{false}(dudt_, tgrad = basic_tgrad)
prob = ODEProblem{false}(ff, x, getfield(n, :tspan), p)
  2. The current example you added doesn't seem to show any benefit from using LTC, and instead actually gives a much higher loss for the same number of epochs (the 400 you used) compared to just a dense layer.
## LTC
epoch = 400
loss_(data_x, data_y) = 0.0020240898579002888

## changing m to m = Chain(Dense(2, 32, tanh), Dense(32, 1, x -> x)) in the example
epoch = 400
loss_(data_x, data_y) = 1.0341962f-5
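
To make point 1 concrete, here is a minimal sketch of how the struct above could be wired up, mirroring how NeuralODE is defined in DiffEqFlux; the constructor signature and the forward pass (including how the solver args/kwargs are passed) are my assumptions, not a finished design:

using DiffEqFlux, OrdinaryDiffEq, Flux

# Hypothetical constructor: destructure the Flux model into (p, re) and store
# the LTC-specific parameters τ and A alongside the solver args/kwargs.
function LTC(model, tspan, τ, A, args...; kwargs...)
    p, re = Flux.destructure(model)
    LTC(model, p, re, tspan, τ, A, args, kwargs)
end

# Hypothetical forward pass: build the ODEProblem from eq. 1 and call solve,
# just like NeuralODE does internally.
function (n::LTC)(x, p = n.p)
    dudt_(u, p, t) = -(1 ./ n.τ .+ n.re(p)(u)) .* u .+ n.re(p)(u) .* n.A
    prob = ODEProblem{false}(ODEFunction{false}(dudt_), x, n.tspan, p)
    solve(prob, n.args...; n.kwargs...)
end

That would keep the LTC consistent with the other NeuralDELayers and let users pass any solver or adjoint options straight through args/kwargs.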

I think we can use this as the test, but for an example we would want to recreate one of the experiments from the paper, probably MNIST, since we have some existing code for it that should help get it up and running.

Vaibhavdixit02 avatar Apr 25 '21 05:04 Vaibhavdixit02

Agree 100% with @Vaibhavdixit02's points.

ChrisRackauckas avatar Apr 26 '21 12:04 ChrisRackauckas

@manyfeatures just checking if this is something you are still working on?

Vaibhavdixit02 avatar May 07 '21 16:05 Vaibhavdixit02

@Vaibhavdixit02 Yep, I'll try to rework the PR in a couple of days

manyfeatures avatar May 07 '21 17:05 manyfeatures

Great 👍

Vaibhavdixit02 avatar May 07 '21 17:05 Vaibhavdixit02

I've run into a problem: I can't limit the hidden state amplitude if I reformulate the task as a NeuralODE layer. I'll describe the case on Discourse.

manyfeatures avatar May 21 '21 14:05 manyfeatures

@manyfeatures you would also want to take a look at https://github.com/lungd/LTC.jl btw

Vaibhavdixit02 avatar May 21 '21 14:05 Vaibhavdixit02

I created the post with the code and will also examine it later as well.

manyfeatures avatar May 21 '21 18:05 manyfeatures