
QuasiRandomTraining(sampling_alg = SobolSample()) is not working correctly

Open · YichengDWu opened this issue 1 year ago • 18 comments

julia> s = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, LatinHypercubeSample())
2×300 Matrix{Float64}:
 -1.08     -1.33333  -1.86     -1.49333  …  -1.54667  -1.66667  -1.47333
  1.35667   1.53667   1.84333   1.07333      1.28      1.66667   1.24667

julia> scatter(s[1,:],s[2,:])
[image: scatter plot of the LatinHypercubeSample points]
julia> s2 = NeuralPDE.generate_quasi_random_points(300, [[-1.0,-2.0],[1.0,2.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 -1.00586  -1.50586  -1.75586  …  -1.2373  -1.7373  -1.9873  -1.4873
  1.00586   1.50586   1.75586      1.2373   1.7373   1.9873   1.4873

julia> scatter(s2[1,:],s2[2,:])
[image: scatter plot of the NeuralPDE SobolSample points; the two coordinates mirror each other, so every point falls on a single line]

YichengDWu · Sep 27 '22 03:09

What if the call is directly to QuasiMonteCarlo.jl? Can this be isolated without NeuralPDE.jl?

ChrisRackauckas · Sep 27 '22 05:09

Just SobolSample.

YichengDWu · Sep 27 '22 05:09

julia> s3 = QuasiMonteCarlo.sample(300, [-3.0,1.0],[-2,2], SobolSample())
2×300 Matrix{Float64}:
 -2.99414  -2.49414  -2.24414  …  -2.7627  -2.2627  -2.0127  -2.5127
  1.49805   1.99805   1.24805      1.3291   1.8291   1.0791   1.5791

julia> scatter(s3[1,:],s3[2,:])
[image: scatter plot of the points sampled directly with QuasiMonteCarlo.jl]

YichengDWu · Sep 27 '22 06:09

So the bug is in NeuralPDE:

julia> s3 =  NeuralPDE.generate_quasi_random_points(300, [[0.0,1.0],[0.0,1.0]], Float64, SobolSample())
2×300 Matrix{Float64}:
 0.00585938  0.505859  0.755859  0.255859  …  0.737305  0.987305  0.487305
 0.00585938  0.505859  0.755859  0.255859     0.737305  0.987305  0.487305

YichengDWu · Sep 27 '22 06:09

🤔 how... alright. I haven't looked at the sampling code in NeuralPDE.jl but like the other things I refactored, the answer is probably just to aggressively simplify and delete a bunch of cruft.

ChrisRackauckas · Sep 27 '22 06:09

There is no randomness in the Sobol sequence?

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

julia> QuasiMonteCarlo.sample(10, [0.0], [1.0], SobolSample())
1×10 Matrix{Float64}:
 0.1875  0.6875  0.9375  0.4375  0.3125  …  0.5625  0.0625  0.09375  0.59375

NeuralPDE just samples along each dimension and vcats them.
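For illustration, a minimal sketch of what that construction amounts to (a reconstruction, not the actual NeuralPDE source):

using QuasiMonteCarlo

# Hypothetical reconstruction of the buggy scheme: sample each dimension
# independently with its own 1-D Sobol call, then stack the rows. Every
# row is the same deterministic sequence, just rescaled to [lb[i], ub[i]].
function buggy_quasi_random_points(n, lb, ub)
    rows = [QuasiMonteCarlo.sample(n, [lb[i]], [ub[i]], SobolSample())
            for i in eachindex(lb)]
    return vcat(rows...)
end

s = buggy_quasi_random_points(300, [0.0, 0.0], [1.0, 1.0])
@assert s[1, :] == s[2, :]  # identical coordinates: every point lies on a line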

YichengDWu · Sep 27 '22 06:09

julia> a = QuasiMonteCarlo.sample(20, [0.0], [1.0], SobolSample())
1×20 Matrix{Float64}:
 0.09375  0.59375  0.84375  0.34375  0.46875  …  0.546875  0.796875  0.296875

julia> b = QuasiMonteCarlo.sample(20, [1.0], [2.0], SobolSample())
1×20 Matrix{Float64}:
 1.09375  1.59375  1.84375  1.34375  …  1.04688  1.54688  1.79688  1.29688

julia> a .+1 .== b
1×20 BitMatrix:
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

YichengDWu · Sep 27 '22 06:09

NeuralPDE just samples along each dimension and vcats them.

wtf? You definitely cannot do that for high-dimensional samplers. The whole point of these sequences is how they cover the high-dimensional space, which is not just a tensor product of 1-D samples.

There is no randomness in the Sobol sequence?

It should be quasi-random. Check direct calls to https://github.com/stevengj/Sobol.jl (QuasiMonteCarlo.jl is just a wrapper over that).
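For reference, a direct Sobol.jl call looks like this (a minimal sketch; SobolSeq and next! are the Sobol.jl API):

using Sobol

# A genuine 2-D Sobol sequence: both coordinates are generated jointly,
# not as two copies of the 1-D sequence.
seq = SobolSeq(2)
pts = reduce(hcat, [next!(seq) for _ in 1:10])  # 2×10 matrix of points in [0,1)^2

# The construction is fully deterministic: a fresh SobolSeq replays it exactly.
seq2 = SobolSeq(2)
@assert pts == reduce(hcat, [next!(seq2) for _ in 1:10])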

ChrisRackauckas · Sep 27 '22 06:09

That's what it does in https://github.com/SciML/NeuralPDE.jl/blob/master/src/training_strategies.jl#L168.

YichengDWu · Sep 27 '22 06:09

Yup, both more complicated than just directly sampling, and incorrect. Good find, thank you. You handling this or should I?

ChrisRackauckas · Sep 27 '22 06:09

This doesn't look too hard; I'll try to fix it over the weekend.

YichengDWu · Sep 27 '22 06:09

It should be quasi-random.

QuasiMonteCarlo.jl is probably incorrect too. A new SobolSeq is created inside sample, and that kills any randomness across calls.

https://github.com/SciML/QuasiMonteCarlo.jl/blob/master/src/QuasiMonteCarlo.jl#L170

YichengDWu · Sep 28 '22 21:09

The Sobol sequence is a quasi-random series, not a pseudo-random one: there is no randomness in its construction. The only thing you could randomize is the index i at which you start in the sequence, but I'm not sure its convergence is guaranteed if you don't start at the beginning, so I believe each integration would need to start from 1, and thus it would be a deterministic low-discrepancy sequence.

Latin hypercubes, in contrast, do have randomness in their construction, because there are (finitely) many different Latin hypercubes one can construct to tile n dimensions. So we should make sure that one exhibits the correct randomness, while the Sobol sequence does not necessarily need to.

ChrisRackauckas · Sep 29 '22 06:09

It makes no difference whether the SobolSeq is reconstructed if you only sample once. But for resampling, I think it should continue from the end of the last sequence. The setup seems odd to me, though: the data is resampled on each call to the loss function: https://github.com/SciML/NeuralPDE.jl/blob/master/src/training_strategies.jl#L222.
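For example, one could hold a single long-lived SobolSeq and draw each batch from where the previous one stopped (just a sketch, not what NeuralPDE currently does):

using Sobol

# One persistent sequence shared across loss evaluations; each resampling
# continues where the previous batch stopped.
seq = SobolSeq(2)
next_batch(n) = reduce(hcat, [next!(seq) for _ in 1:n])

b1 = next_batch(100)
b2 = next_batch(100)
@assert b1 != b2  # fresh points, unlike re-creating the SobolSeq on every call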

YichengDWu · Sep 29 '22 06:09

resampling = true and resampling = false currently behave the same for SobolSample().

YichengDWu · Sep 29 '22 06:09

But for resampling, I think it should iterate from the end of the last sequence.

Do you have a reference on whether that gives a convergent integration? There are high-dimensional sequences, like sparse grids, which are deterministic tilings that require that you do not cut off the start in order to get a convergent integration. I don't know whether Sobol has that property or not.

ChrisRackauckas · Sep 29 '22 06:09

I'm reading https://arxiv.org/pdf/2207.10289.pdf. I'm just saying that resampling needs to generate different points, and that seems to be good for PINNs in the paper.

YichengDWu · Sep 29 '22 07:09

Yes, resampling is required for a PINN to get a good loss sampling, otherwise things overfit. However, that doesn't actually answer the question. The QMC methods are supposed to be convergent integration schemes: is starting from n a convergent integration scheme on Sobol? For sparse grids, the answer is no. I am not sure if that's the case for Sobol. If it's not, then it's not a good sampling of the loss function. If it is, then having a mutable counter inside of SobolSampling that updates a starting location would be a good idea.
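A sketch of what such a counter could look like (SobolCounterSample and sample! are hypothetical names, and this assumes resuming after the first count points is a valid sampling; Sobol.jl's skip could replace the discard loop):

using Sobol

# Hypothetical stateful sampler: remembers how many points were drawn and
# fast-forwards past them before producing the next batch.
mutable struct SobolCounterSample
    dim::Int
    count::Int
end
SobolCounterSample(dim) = SobolCounterSample(dim, 0)

function sample!(s::SobolCounterSample, n)
    seq = SobolSeq(s.dim)
    foreach(_ -> next!(seq), 1:s.count)  # discard the points already used
    s.count += n
    return reduce(hcat, [next!(seq) for _ in 1:n])
end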

Also, I wouldn't read too much into the LHS result. The two implementations don't seem to use the same optimization: https://mrurq.github.io/LatinHypercubeSampling.jl/stable/man/lhcoptim/ vs. https://scikit-optimize.github.io/stable/modules/generated/skopt.sampler.Lhs.html.

ChrisRackauckas · Sep 29 '22 07:09