DiffEqFlux.jl
WIP: adds LTC layer and its example
This PR adds a Liquid Time-Constant (LTC) layer (#509).
The original repo1 and repo2 contain code consistent with the biologically inspired NCP architecture.

The current implementation differs from them and corresponds to the plain description in the paper.

To ensure stability, the output of the network is clipped to the range [-1, 1].
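A minimal sketch of such clipping, assuming it is applied element-wise with Julia's built-in `clamp` (the variable names here are illustrative, not from the PR):

```julia
# Illustrative only: element-wise clipping of the network output to [-1, 1].
y = randn(Float32, 8)              # stand-in for the raw network output
y_clipped = clamp.(y, -1f0, 1f0)   # keeps every entry within [-1, 1]
```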
Looks great, but I won't be able to review until tomorrow.
A couple of points; I am still going through it and will get back with more informed feedback later.
- I can see that in the Python repo they are doing the evolution of the ODE with a loop. I am wondering if we can instead use `solve` for it by defining the diff eq mentioned in eq. 1 directly. My interpretation is the `LTC` should look similar to the `NeuralODE` and be something like (see also the sketch after the snippets below):
```julia
struct LTC{M,P,RE,T,TA,AB,A,K} <: NeuralDELayer
    model::M
    p::P
    re::RE
    tspan::T
    τ::TA
    A::AB
    args::A
    kwargs::K
end
```
where the user gives you the NN as the `model` and we internally create the `ODEProblem` with it, as below (`n` is the LTC layer):
```julia
# Eq. 1 from the paper: du/dt = -(1/τ + f(u, θ)) ⊙ u + f(u, θ) ⊙ A,
# broadcast element-wise so it works on vector states
dudt_(u, p, t) = -(1 ./ n.τ .+ n.re(p)(u)) .* u .+ n.re(p)(u) .* n.A
ff = ODEFunction{false}(dudt_, tgrad = basic_tgrad)
prob = ODEProblem{false}(ff, x, getfield(n, :tspan), p)
```
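Filling that out, a hedged sketch of the constructor and forward pass for the `LTC` struct above, mirroring how `NeuralODE` is set up in DiffEqFlux (the `Flux.destructure` call and the bare `solve` invocation are my assumptions about how the pieces would fit, not code from this PR):

```julia
using DiffEqFlux, OrdinaryDiffEq, Flux

basic_tgrad(u, p, t) = zero(u)  # same time-gradient shortcut NeuralODE uses

# Convenience constructor: destructure the NN so the flat parameter
# vector p can be passed through the ODE solve.
function LTC(model, tspan, τ, A, args...; kwargs...)
    p, re = Flux.destructure(model)
    LTC(model, p, re, tspan, τ, A, args, kwargs)
end

# Hypothetical forward pass: build the eq. 1 right-hand side around the
# restructured NN and solve over the layer's tspan.
function (n::LTC)(x, p = n.p)
    dudt_(u, p, t) = -(1 ./ n.τ .+ n.re(p)(u)) .* u .+ n.re(p)(u) .* n.A
    ff = ODEFunction{false}(dudt_, tgrad = basic_tgrad)
    prob = ODEProblem{false}(ff, x, getfield(n, :tspan), p)
    solve(prob, n.args...; n.kwargs...)
end
```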
- The current example you added doesn't seem to show any benefit of using LTC, and instead actually gives much higher loss for the same number of epochs (the 400 you used) than just a dense layer (a reproduction sketch follows the numbers below):
```
## LTC
epoch = 400
loss_(data_x, data_y) = 0.0020240898579002888

## changing m to m = Chain(Dense(2,32, tanh), Dense(32,1,x->x)) in the example
epoch = 400
loss_(data_x, data_y) = 1.0341962f-5
```
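For reference, a hedged sketch of the dense baseline in that comparison, assuming the example is a small regression fit trained with Flux's classic training loop (`data_x`, `data_y`, and the `ADAM` settings are stand-ins, not taken from the example):

```julia
using Flux

# Hypothetical reproduction of the dense-layer baseline quoted above.
m = Chain(Dense(2, 32, tanh), Dense(32, 1))   # Dense(32,1) defaults to identity activation
loss_(x, y) = Flux.mse(m(x), y)

data_x = rand(Float32, 2, 100)   # stand-ins for the example's dataset
data_y = rand(Float32, 1, 100)

opt = ADAM(0.01)
for epoch in 1:400
    Flux.train!(loss_, Flux.params(m), [(data_x, data_y)], opt)
end
@show loss_(data_x, data_y)
```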
I think we can use this as a test, but for an example we would want to recreate one of the experiments from the paper, probably MNIST since we have some existing code for it that should help get it up and running.
Agree 100% with @Vaibhavdixit02's points.
@manyfeatures just checking if this is something you are still working on?
@Vaibhavdixit02 Yep, I'll try to rework the PR in a couple of days.
Great 👍
I've run into a problem: I can't limit the hidden state amplitude if I reformulate the task as a NeuralODE layer. I'll describe the case on Discourse.
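A hypothetical sketch of the difficulty (assumed toy setup, not code from the PR): with a `NeuralODE`, clipping can only be applied to the returned trajectory, while the unclipped hidden state is what the solver actually integrates:

```julia
using DiffEqFlux, OrdinaryDiffEq, Flux

# Assumed toy setup: a NeuralODE over a 2-dimensional state.
node = NeuralODE(Chain(Dense(2, 16, tanh), Dense(16, 2)),
                 (0f0, 1f0), Tsit5(), saveat = 0.1f0)
x = rand(Float32, 2)

sol = node(x)                         # the state evolving inside solve is unbounded
out = clamp.(Array(sol), -1f0, 1f0)   # clipping here only bounds the saved outputs
```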
@manyfeatures you would also want to take a look at https://github.com/lungd/LTC.jl btw
I created the post with the code and will also look into it later.