OperatorLearning.jl
Fourier Layer Tests
I have added a couple of simple tests for the Fourier layer and the DeepONet layer. What more tests can we add? One thing I wanted to add to test/deeponet.jl:
using Flux, OperatorLearning, Test

# `a` and `sensors` are assumed test inputs; hypothetical shapes that match the
# layer sizes below (16 branch sensors, 1-d trunk evaluation points):
a = rand(Float32, 16, 5)
sensors = rand(Float32, 1, 10)

model1 = DeepONet((16, 22, 30), (1, 16, 24, 30), σ, tanh; init_branch=Flux.glorot_normal, bias_trunk=false)
parameters = params(model1)
branch = Chain(Dense(16, 22, init=Flux.glorot_normal), Dense(22, 30, init=Flux.glorot_normal))
trunk = Chain(Dense(1, 16, bias=false), Dense(16, 24, bias=false), Dense(24, 30, bias=false))
model2 = DeepONet(branch, trunk)

# forward pass
@test model1(a, sensors) ≈ model2(a, sensors)

# gradients w.r.t. the inputs
m1grad = Flux.Zygote.gradient((x, p) -> sum(model1(x, p)), a, sensors)
m2grad = Flux.Zygote.gradient((x, p) -> sum(model2(x, p)), a, sensors)
@test !iszero(m1grad[1]) && !iszero(m1grad[2])
@test !iszero(m2grad[1]) && !iszero(m2grad[2])
@test m1grad[1] ≈ m2grad[1] rtol=1e-12
@test m1grad[2] ≈ m2grad[2] rtol=1e-12
but the problem is that making the parameters the same for model1 and model2 doesn't seem feasible here. Besides, I wanted to know how to formulate a test for training FNO and DeepONet.
For the training test, let's start with regression testing. The tutorial has examples using these models to solve some equations. Do that on a PDE with a known analytical solution and take the difference against the analytical solution, putting the tolerance just above the error you get locally. The test would then trigger if the training ever gets worse. Usually taking the current loss and multiplying it by something like 3 or 5 is a safe regression value.
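A minimal skeleton for such a regression test might look like this; everything here is a placeholder sketch (the model, the data loader, and the 0.02 bound stand in for whatever the tutorial run actually produces, with the bound set at roughly 3-5x the locally observed loss):
using Flux, Test

# Hedged sketch of a training regression test. `model`, `train_loader`, `xtest`,
# and `analytical` are placeholders for the tutorial setup.
function regression_test(model, train_loader, xtest, analytical; epochs=10)
    opt = ADAM(1e-3)
    loss(x, y) = Flux.Losses.mse(model(x), y)
    for _ in 1:epochs
        Flux.train!(loss, Flux.params(model), train_loader, opt)
    end
    # bound = locally observed final loss times a safety factor of ~3-5
    @test Flux.Losses.mse(model(xtest), analytical) < 0.02
end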
Would we need a dataset in the library for that?
Maybe we can add a smaller version of the Burgers' equation dataset which contains just what we need for the test, because the whole data file is like 600 MB.
It has data for 2048 initial conditions at 8192 points each 😅. We can use something like 100-200 ICs at 1024 points (just like the tutorial), would that work?
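For trimming the file down, something along these lines should work (a sketch only; burgers_full.mat is a placeholder name for the downloaded 600 MB file, and the variable names a/u match what the training code below reads):
using MAT

# Cut the full dataset (2048 ICs x 8192 points) down to 300 ICs at 1024 points
# by taking every 8th spatial sample.
vars = matread("burgers_full.mat")
keep_ics, stride = 300, 8
matwrite("burgerset.mat", Dict(
    "a" => vars["a"][1:keep_ics, 1:stride:end],   # initial conditions
    "u" => vars["u"][1:keep_ics, 1:stride:end],   # corresponding solutions
))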
So I tried implementing a training test for the Fourier layer. I think there could be a bug here; I have followed the Burgers' equation example. Code:
vars = matread("burgerset.mat")
xtrain = vars["a"][1:280, :]
xtest = vars["a"][end-19:end, :]
ytrain = vars["u"][1:280, :]
ytest = vars["u"][end-19:end, :]
grid = collect(range(0, 1, length=length(xtrain[1,:])))
xtrain = cat(reshape(xtrain,(280,1024,1)),
reshape(repeat(grid,280),(280,1024,1));
dims=3)
ytrain = cat(reshape(ytrain,(280,1024,1)),
reshape(repeat(grid,280),(280,1024,1));
dims=3)
xtest = cat(reshape(xtest,(20,1024,1)),
reshape(repeat(grid,20),(20,1024,1));
dims=3)
ytest = cat(reshape(ytest,(20,1024,1)),
reshape(repeat(grid,20),(20,1024,1));
dims=3)
xtrain, xtest = permutedims(xtrain,(3,2,1)), permutedims(xtest,(3,2,1))
ytrain, ytest = permutedims(ytrain,(3,2,1)), permutedims(ytest,(3,2,1))
train_loader = Flux.Data.DataLoader((xtrain, ytrain), batchsize=20, shuffle=true)
test_loader = Flux.Data.DataLoader((xtest, ytest), batchsize=20, shuffle=false)
layer = FourierLayer(128,128,1024,16,gelu,bias_fourier=false)
model = Chain(Dense(2,128;bias=false), layer, layer, layer, layer,
Dense(128,2;bias=false))
learning_rate = 0.001
opt = ADAM(learning_rate)
parameters = params(model)
loss(x,y) = Flux.Losses.mse(model(x),y)
evalcb() = @show(loss(xtest,ytest))
throttled_cb = Flux.throttle(evalcb, 5)
Flux.@epochs 500 Flux.train!(loss, parameters, train_loader, opt, cb = throttled_cb)
Error:
MethodError: no method matching batched_gemm(::Char, ::Char, ::Array{ComplexF64, 3}, ::Array{ComplexF32, 3})
Closest candidates are:
  batched_gemm(::AbstractChar, ::AbstractChar, ::AbstractArray{ComplexF64, 3}, !Matched::AbstractArray{ComplexF64, 3}) at C:\Users\user\.julia\packages\BatchedRoutines\4RDBA\src\blas.jl:137
  batched_gemm(::AbstractChar, ::AbstractChar, !Matched::ComplexF32, ::AbstractArray{ComplexF32, 3}, !Matched::AbstractArray{ComplexF32, 3}) at C:\Users\user\.julia\packages\BatchedRoutines\4RDBA\src\blas.jl:134
  batched_gemm(::AbstractChar, ::AbstractChar, !Matched::AbstractArray{ComplexF32, 3}, ::AbstractArray{ComplexF32, 3}) at C:\Users\user\.julia\packages\BatchedRoutines\4RDBA\src\blas.jl:137
Stacktrace:
  in eval at base\boot.jl:373
  in top-level scope at Juno\n6wyj\src\progress.jl:119
  in macro expansion at Flux\qAdFM\src\optimise\train.jl:144
  in train! at Flux\qAdFM\src\optimise\train.jl:105
  in var"#train!#36" at Flux\qAdFM\src\optimise\train.jl:107
  in macro expansion at Juno\n6wyj\src\progress.jl:119
  in macro expansion at Flux\qAdFM\src\optimise\train.jl:109
  in gradient at Zygote\FPUm3\src\compiler\interface.jl:75
  in pullback at Zygote\FPUm3\src\compiler\interface.jl:352
  in _pullback at Zygote\FPUm3\src\compiler\interface2.jl
  in _pullback at Flux\qAdFM\src\optimise\train.jl:110
  in _pullback at ZygoteRules\AIbCs\src\adjoint.jl:65
  in adjoint at Zygote\FPUm3\src\lib\lib.jl:200
  in _apply at base\boot.jl:814
  in _pullback at fourier_tests.jl:135
  in _pullback at Flux\qAdFM\src\layers\basic.jl:49
  in _pullback at Flux\qAdFM\src\layers\basic.jl:47
  in _pullback at Flux\qAdFM\src\layers\basic.jl:47
  in _pullback at dev\OperatorLearning\src\FourierLayer.jl:115
  in _pullback at OMEinsum\EMISk\src\interfaces.jl:204
  in _pullback at Zygote\FPUm3\src\compiler\interface2.jl:9
  in macro expansion at Zygote\FPUm3\src\compiler\interface2.jl
  in chain_rrule at Zygote\FPUm3\src\compiler\chainrules.jl:216
  in rrule at ChainRulesCore\uxrij\src\rules.jl:134
  in rrule at OMEinsum\EMISk\src\autodiff.jl:33
  in einsum at OMEinsum\EMISk\src\interfaces.jl:200
  in einsum at OMEinsum\EMISk\src\binaryrules.jl:98
  in einsum at OMEinsum\EMISk\src\binaryrules.jl:226
  in _batched_gemm at OMEinsum\EMISk\src\utils.jl:119
Both xtrain and grid are Float64 here. When I make them Float32 explicitly, the error resolves and the model trains normally. I did the same to avoid it here:
https://github.com/Abhishek-1Bhatt/OperatorLearning.jl/blob/c92d3ed1eca77ea61b756864bded99e6f42dc878/test/fourierlayer.jl#L34-L35
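For reference, the workaround amounts to an explicit element-type conversion before building the data loaders, along these lines (a sketch of what the linked lines do):
# cast the data (and thereby the grid channel) to Float32 so the inputs match
# the layer's Float32/ComplexF32 parameters
xtrain, xtest = Float32.(xtrain), Float32.(xtest)
ytrain, ytest = Float32.(ytrain), Float32.(ytest)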
The test for DeepONet works fine.
Then split out the DeepONet tests so those can merge quicker while the other ones are investigated.
Codecov Report
Merging #35 (c92d3ed) into master (9b16e02) will increase coverage by 16.43%. The diff coverage is n/a.

@@            Coverage Diff             @@
##           master      #35       +/-   ##
===========================================
+ Coverage   41.09%   57.53%   +16.43%
===========================================
  Files           6        6
  Lines          73       73
===========================================
+ Hits           30       42       +12
+ Misses         43       31       -12

Impacted Files      | Coverage Δ
--------------------|---------------------------
src/DeepONet.jl     | 60.00% <0.00%> (+20.00%) ↑
src/FourierLayer.jl | 74.19% <0.00%> (+29.03%) ↑
Yeah, somehow FourierLayer doesn't promote its parameters' data type to match the inputs, although it should. I haven't found the culprit for sure, but the number-one suspects are the tensor multiplications:
https://github.com/SciML/OperatorLearning.jl/blob/9b16e02b68a4bc8bb7a82098cbab8bc10e50a02d/src/FourierLayer.jl#L107
https://github.com/SciML/OperatorLearning.jl/blob/9b16e02b68a4bc8bb7a82098cbab8bc10e50a02d/src/FourierLayer.jl#L115
I'm working on switching those implementations out for more specialized code anyway in #31, but the problem might well be elsewhere; that's just my best guess so far.
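For the record, the eltype collision is easy to reproduce in isolation. A minimal sketch, assuming the layer stores its spectral weights as ComplexF32 (Flux's Float32 default) while the raw .mat data arrives as Float64; the shapes are made up:
using FFTW

x = rand(Float64, 2, 1024, 20)    # Float64 input, as matread returns it
W = rand(ComplexF32, 2, 2, 513)   # hypothetical ComplexF32 spectral weights

x̂ = rfft(x, 2)                    # rfft of a Float64 array is ComplexF64
eltype(x̂), eltype(W)              # (ComplexF64, ComplexF32): the exact pair
                                  # batched_gemm has no method for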
There's one last thing: for reading the data from the .mat file, would we need MAT.jl as one of the dependencies? Do I run add MAT in the Pkg REPL while the OperatorLearning env is activated to add it to Project.toml?
Yep. However, I wouldn't include MAT as a package dependency, only as a test dependency. We could put it as a test-specific dependency in the main Project.toml, but that mechanism will be deprecated in the future as described here. I would rather have a completely separate environment for tests by creating a test/Project.toml where MAT is included, as advised in the linked docs above.
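In practice, that setup is just (run from the package root; Pkg fills in the UUIDs for you):
using Pkg
Pkg.activate("test")   # creates/activates test/Project.toml
Pkg.add("MAT")         # recorded only in the test environment
Pkg.add("Test")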