DiffEqFlux.jl icon indicating copy to clipboard operation
DiffEqFlux.jl copied to clipboard

Collocation is not timescale independent

Open JTaets opened this issue 3 years ago • 3 comments

All the collocation methods only work when the timescale is correct. This is easily fixed by artificially "normalising" the time-scale. But when the data has multiple dimensions, with multiple orders of size, this method will also fail. I think the code of the kernels need to be adapted to do something like this automatically for each row in the data.

tsteps = 0:0.01:1
data = reshape( sin.(15*tsteps),1,:)

# bad
du,u = collocate_data(data,tsteps,EpanechnikovKernel())
scatter(tsteps,data')
plot!(tsteps,u',lw=5) 

# good
scale = 10 # higher scale -> better fit
du,u = collocate_data(data,scale*tsteps,EpanechnikovKernel()) # scaled tsteps
du.*= scale # scaled du
scatter(tsteps,data')
plot!(tsteps,u',lw=5) 

JTaets avatar Feb 11 '22 12:02 JTaets

I agree with you. Can you open a PR for this?

ChrisRackauckas avatar Mar 27 '22 22:03 ChrisRackauckas

Apologies for the late response.

The problem is what the value of the parameter scale should be. Taking a high value will let you have good fits, but too high will let us fit to the noise.

I see 3 solutions:

  1. Let the user pass the scale, maybe something like EpanechnikovKernel(scale). A good choice of scale can be easily checked by a plot like i did above.
  2. Automatically detection of the scale. But I don't know how easy this is with noise.
  3. Implement signal filters such that the user needs to filter the noise out him/herself (DSP.jl?), then do a fit with high scale.

My preference is for 3, since this is way more tangible.

JTaets avatar May 19 '22 15:05 JTaets

We should probably do 1 and 3, but 1 is low hanging fruit.

ChrisRackauckas avatar Jun 11 '22 05:06 ChrisRackauckas

Scaling the timeseries does work but it can easily break:

tsteps = 0:0.1:1
data = reshape( sin.(15*tsteps),1,:)
scale = 10
du,u = collocate_data(data,scale*tsteps,EpanechnikovKernel())

Error:

ERROR: LinearAlgebra.SingularException(2)
Stacktrace:
  [1] ldiv!(B::Matrix{Float64}, D::LinearAlgebra.Diagonal{Float64, Vector{Float64}}, A::LinearAlgebra.Adjoint{Float64, Matrix{Float64}})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/diagonal.jl:425
  [2] \
    @ ~/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/diagonal.jl:412 [inlined]
  [3] \(A::Matrix{Float64}, B::LinearAlgebra.Adjoint{Float64, Matrix{Float64}})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.9.3+0.x64.linux.gnu/share/julia/stdlib/v1.9/LinearAlgebra/src/generic.jl:1107
  [4] (::DiffEqFlux.var"#179#180"{Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, EpanechnikovKernel, Matrix{Float64}, Matrix{Float64}, Matrix{Float64}, Float64, Vector{Float64}, Vector{Float64}})(_t::Float64)
    @ DiffEqFlux ~/.julia/packages/DiffEqFlux/5uK1j/src/collocation.jl:149
  [5] iterate
    @ ./generator.jl:47 [inlined]
  [6] _collect(c::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, itr::Base.Generator{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, DiffEqFlux.var"#179#180"{Matrix{Float64}, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, EpanechnikovKernel, Matrix{Float64}, Matrix{Float64}, Matrix{Float64}, Float64, Vector{Float64}, Vector{Float64}}}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base ./array.jl:802
  [7] collect_similar
    @ ./array.jl:711 [inlined]
  [8] map
    @ ./abstractarray.jl:3263 [inlined]
  [9] collocate_data(data::Matrix{Float64}, tpoints::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, kernel::EpanechnikovKernel)
    @ DiffEqFlux ~/.julia/packages/DiffEqFlux/5uK1j/src/collocation.jl:142

This happens when for any point, the kernel weights becomes zero for the neighbouring points. This happens when the time points are far from each other (which happens when they are scaled) as the support for kernel function is between 1 and -1.

sathvikbhagavan avatar Sep 01 '23 15:09 sathvikbhagavan

@ChrisRackauckas can this issue be closed as https://github.com/SciML/DiffEqFlux.jl/pull/853 addresses bandwidth being passed by the user or should we also implement point 3 mentioned above: "Implement signal filters such that the user needs to filter the noise out him/herself (DSP.jl?), then do a fit with high scale."?

sathvikbhagavan avatar Sep 30 '23 04:09 sathvikbhagavan