DiffEqFlux.jl
Problems using TensorLayer as a hidden layer
I really appreciate the SciML packages; I'm a huge fan now. I ran into some questions while using TensorLayer as a hidden layer.
Q(1) The first problem is about broadcasting: it seems that broadcasting doesn't work with TensorLayer.
using DiffEqFlux, DifferentialEquations, LinearAlgebra
B = [PolynomialBasis(2)]
nn0 = Dense(1,1)
nn1 = TensorLayer(B, 1)
nn2 = FastDense(1,1)
u = [1 2 3]
println(nn0(u)) # Float32[0.95563 1.91126 2.86689], one output per input column
println(nn1(u)) # [-0.9934653167126005] ??? only ONE output ???
println(nn2(u,initial_params(nn2))) # Float32[-0.7425173 -1.4850346 -2.227552]
Because broadcasting is missing, TensorLayer cannot even be used as a hidden layer in the following way:
nn = TensorLayer([SinBasis(2),CosBasis(2)],1)
nn_1 = Chain(Dense(1,2),nn)
nn_2 = Chain(Dense(1,2),Dense(2,1))
u = [1.0 2.0 3.0]
println(nn_1(u)) # [-0.02888989486945616] ??? only ONE output ???
println(nn_2(u)) # [-1.239922964687894 -2.479845929375788 -3.719768894063682]
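Until the broadcast is fixed, one possible stopgap (the colwise helper below is my own, not part of the DiffEqFlux API) is to map the TensorLayer over the input columns explicitly:
using DiffEqFlux

# Hypothetical helper: apply a layer column-by-column so it behaves like
# the other layers on a batch of column vectors.
colwise(layer) = u -> mapreduce(c -> layer(c), hcat, eachcol(u))

nn = TensorLayer([SinBasis(2), CosBasis(2)], 1)
nn_1 = Chain(Dense(1, 2), colwise(nn))
u = [1.0 2.0 3.0]
println(nn_1(u)) # now a 1×3 output, one entry per input column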
Q(2) How do I train a model with a hidden TensorLayer?
I got the following code working, but I think it is awkward. Moreover, the loss ends at 7916 and doesn't change. My questions are:
- how to clean up the following code,
- and how to train a DNN with a hidden TensorLayer, e.g. choosing a specific optimizer as in SINDy (Sparse Identification of Nonlinear Dynamics with Control).
# 1. Generate data from sin(x)
x = Array(-2π:0.1:2π)'
data = sin.(x)
# 2. Learn with a DNN that has a hidden TensorLayer
function dnn_output(θ, v)
    tensor_layer = TensorLayer([SinBasis(2)], 1)
    dnn_model = FastChain(FastDense(1, 1),
                          (y, p) -> tensor_layer(y, θ[1:2])[1],
                          FastDense(1, 1))
    value = []
    for i in v
        push!(value, dnn_model(i, θ[3:end])[1])
    end
    Array(value)
end
function train(θ)
    dnn_output(θ, x)
end
p = randn(6)
println(train(p)) # output a vector.
function loss(θ)
    pred = train(θ)
    sum(abs2, data .- pred)
end
println(loss(p)) # 29790
res = DiffEqFlux.sciml_train(loss, p, ADAM(0.05), maxiters = 2000)
println(loss(res.minimizer)) # 7916 won't change
Q(1) is a bug. Yeah that needs to get fixed. It needs to broadcast its operation along the rows.
For Q(2), that might just be a local minimum?
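For the code-cleanup half of Q(2), here is one possible tidier version, plus the common trick of refining the ADAM result with a second-order optimizer. This is only a sketch assembled from the code above (BFGS comes from Optim.jl, and whether it escapes this particular minimum is not guaranteed). Note also that in the original loss, data is a 1×N row while pred is a length-N vector, so data .- pred broadcasts to an N×N matrix, which may be part of why the loss stalls at an odd value:
using DiffEqFlux, Optim

x = Array(-2π:0.1:2π)'
data = sin.(x)

# Build the layers once instead of reconstructing them on every call.
tensor_layer = TensorLayer([SinBasis(2)], 1)

function dnn_output(θ, v)
    model = FastChain(FastDense(1, 1),
                      (y, p) -> tensor_layer(y, θ[1:2]),  # θ[1:2]: TensorLayer weights
                      FastDense(1, 1))
    # The comprehension keeps the 1×N shape of `v`, so the loss below
    # broadcasts elementwise instead of forming an N×N matrix.
    [model([xi], θ[3:end])[1] for xi in v]
end

loss(θ) = sum(abs2, data .- dnn_output(θ, x))

p = randn(6)
res1 = DiffEqFlux.sciml_train(loss, p, ADAM(0.05), maxiters = 2000)
# Refine with BFGS; restarting from a few random initial p values can
# also help escape local minima.
res2 = DiffEqFlux.sciml_train(loss, res1.minimizer, BFGS(), maxiters = 200)
println(loss(res2.minimizer))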
I'm really excited to get a response from my idol! One last question:
is a TensorLayer as a hidden layer a good way to couple SINDy and universal differential equations?
Somewhat. It is learning lower dimensional universal approximators. The issue with this approach is that it can be somewhat more prone to local minima though, especially with orthogonal basis functions.
Thank you. I'll keep exploring the problem. I appreciate your work.
This got fixed.