GlobalSensitivity.jl icon indicating copy to clipboard operation
GlobalSensitivity.jl copied to clipboard

Handling NaNs for divergent trajectories in Sobol Method

Open duodenum96 opened this issue 1 year ago • 8 comments

Is your feature request related to a problem? Please describe.

Following from https://discourse.julialang.org/t/handling-divergent-trajectories-in-global-sensitivity-analysis/123186 I am having the following problem in a large parameter space. Let’s say I have two parameters A and B. If I constrain the range of A between 0 and 10 and B between 0 and 5, the sensitivity to A turns out higher. If I do the inverse, sensitivity to B is higher. If I make both of them between 0 and 10, then the solutions to differential equation diverge. I tried putting NaNs to divergent solutions but this impairs the subsequent calculations of sensitivity analysis since the software is not suited for dealing with NaN values (it returns NaNs when there is a NaN in the input).

Describe the solution you’d like

The solution is probably dropping NaNs.

Describe alternatives you’ve considered

So far I was unable to come up with another alternative.

Additional context

I gave it a try and made some changes in the sobol_sensitivity.jl module. The strategy I used is inspired by the conservative option in statsmodels.tsa.stattools.acf function in python statsmodels package (https://www.statsmodels.org/dev/_modules/statsmodels/tsa/stattools/_stattools.html#acf). The strategy is the following. Take the code

fA = all_y[((i - 1) * n + 1):(i * n)]

and rewrite it as

nan_mask = isnan.(all_y)
nonnan_mask = .!nan_mask
nonnan_idx = findall(nonnan_mask)

indices_fA = ((i - 1) * n + 1):(i * n)
if dropnan
    indices_fA = intersect(indices_fA, nonnan_idx)
end
fA = all_y[indices_fA]

When I rewrite gsa_sobol_all_y_analysis function with this strategy, the test code in sobol_method.jl passes all tests with no problem. Additionally, I wrote some test code that generates NaNs and it seems the algorithm doesn't return NaNs, instead it returns actual sensitivity values. I copy-pasted the gsa_sobol_all_y_analysis and my little test code below. But I honestly don't know if this makes sense or the results I get are correct results.

Below is a rewrite of gsa_sobol_all_y_analysis function. Note that I added a dropnan = false option.

function gsa_sobol_all_y_analysis(method, all_y::AbstractArray{T}, d, n, Ei_estimator,
        y_size, ::Val{multioutput}; dropnan = false) where {T, multioutput}
    if dropnan
        nan_mask = isnan.(all_y)
        nonnan_mask = .!nan_mask
        nonnan_idx = findall(nonnan_mask)
        if sum(nonnan_mask) == 0
            throw(ArgumentError("All values are NaN"))
        end
    end

    nboot = method.nboot
    Eys = multioutput ? Matrix{T}[] : T[]
    Varys = multioutput ? Matrix{T}[] : T[]
    Vᵢs = multioutput ? Matrix{T}[] : Vector{T}[]
    Vᵢⱼs = multioutput ? Array{T, 3}[] : Matrix{T}[]
    Eᵢs = multioutput ? Matrix{T}[] : Vector{T}[]
    step = 2 in method.order ? 2 * d + 2 : d + 2
    if !multioutput
        for i in 1:step:(step * nboot)
            indices = ((i - 1) * n + 1):((i + 1) * n)
            if dropnan
                indices = intersect(indices, nonnan_idx)
            end
            push!(Eys, mean(all_y[indices]))
            push!(Varys, var(all_y[indices]))

            indices_fA = ((i - 1) * n + 1):(i * n)
            indices_fB = (i * n + 1):((i + 1) * n)
            if dropnan
                indices_fA = intersect(indices_fA, nonnan_idx)
                indices_fB = intersect(indices_fB, nonnan_idx)
            end

            fA = all_y[indices_fA]
            fB = all_y[indices_fB]
            fAⁱ = []
            for j in (i + 1):(i + d)
                indices_fAⁱ = (j * n + 1):((j + 1) * n)
                if dropnan
                    indices_fAⁱ = intersect(indices_fAⁱ, nonnan_idx)
                end
                push!(fAⁱ, all_y[indices_fAⁱ])
            end
            
            if 2 in method.order
                fBⁱ = []
                for j in (i + d + 1):(i + 2 * d)
                    indices_fBⁱ = (j * n + 1):((j + 1) * n)
                    if dropnan
                        indices_fBⁱ = intersect(indices_fBⁱ, nonnan_idx)
                    end
                    push!(fBⁱ, all_y[indices_fBⁱ])
                end
            end

            push!(Vᵢs, [sum(fB .* (fAⁱ[k] .- fA)) for k in 1:d] ./ n)
            if 2 in method.order
                M = zeros(T, d, d)
                for k in 1:d
                    for j in (k + 1):d
                        M[k, j] = sum((fBⁱ[k] .* fAⁱ[j]) .- (fA .* fB)) / n
                    end
                end
                push!(Vᵢⱼs, M)
            end
            if Ei_estimator === :Homma1996
                push!(Eᵢs,
                    [sum((fA .- (sum(fA) ./ n)) .^ 2) ./ (n - 1) .-
                     sum(fA .* fAⁱ[k]) ./ (n) + (sum(fA) ./ n) .^ 2 for k in 1:d])
            elseif Ei_estimator === :Sobol2007
                push!(Eᵢs, [sum(fA .* (fA .- fAⁱ[k])) for k in 1:d] ./ (n))
            elseif Ei_estimator === :Jansen1999
                push!(Eᵢs, [sum(abs2, fA - fAⁱ[k]) for k in 1:d] ./ (2n))
            elseif Ei_estimator === :Janon2014
                push!(Eᵢs,
                    [(sum(fA .^ 2 + fAⁱ[k] .^ 2) ./ (2n) .-
                      (sum(fA + fAⁱ[k]) ./ (2n)) .^ 2) * (1.0 .-
                      (1 / n .* sum(fA .* fAⁱ[k])
                       .-
                       (1 / n .* sum((fA .+ fAⁱ[k]) ./ 2)) .^ 2) ./
                      (1 / n .* sum((fA .^ 2 .+ fAⁱ[k] .^ 2) ./ 2) -
                       (1 / n .* sum((fA .+ fAⁱ[k]) ./ 2)) .^ 2))
                     for k in 1:d])
            end
        end
    else # multioutput case
        for i in 1:step:(step * nboot)
            indices = ((i - 1) * n + 1):((i + 1) * n)
            if dropnan
                indices = intersect(indices, nonnan_idx)
            end

            push!(Eys, mean(all_y[:, indices], dims = 2))
            push!(Varys, var(all_y[:, indices], dims = 2))

            indices_fA = ((i - 1) * n + 1):(i * n)
            indices_fB = (i * n + 1):((i + 1) * n)

            if dropnan
                indices_fA = intersect(indices_fA, nonnan_idx)
                indices_fB = intersect(indices_fB, nonnan_idx)
            end

            fA = all_y[:, indices_fA]
            fB = all_y[:, indices_fB]
            fAⁱ = []
            for j in (i + 1):(i + d)
                indices = (j * n + 1):((j + 1) * n)
                if dropnan
                    indices = intersect(indices, nonnan_idx)
                end
                push!(fAⁱ, all_y[:, indices])
            end

            if 2 in method.order
                fBⁱ = []
                for j in (i + d + 1):(i + 2 * d)
                    indices = (j * n + 1):((j + 1) * n)
                    if dropnan
                        indices = intersect(indices, nonnan_idx)
                    end
                    push!(fBⁱ, all_y[:, indices])
                end
            end

            push!(Vᵢs,
                reduce(hcat, [sum(fB .* (fAⁱ[k] .- fA), dims = 2) for k in 1:d] ./ n))

            if 2 in method.order
                M = zeros(T, d, d, length(Eys[1]))
                for k in 1:d
                    for j in (k + 1):d
                        Vₖⱼ = sum((fBⁱ[k] .* fAⁱ[j]) .- (fA .* fB), dims = 2) / n
                        for l in 1:length(Eys[1])
                            M[k, j, l] = Vₖⱼ[l]
                        end
                    end
                end
                push!(Vᵢⱼs, M)
            end
            if Ei_estimator === :Homma1996
                push!(Eᵢs,
                    reduce(hcat,
                        [sum((fA .- (sum(fA, dims = 2) ./ n)) .^ 2, dims = 2) ./ (n - 1) .-
                         sum(fA .* fAⁱ[k], dims = 2) ./ (n) + (sum(fA, dims = 2) ./ n) .^ 2
                         for k in 1:d]))
            elseif Ei_estimator === :Sobol2007
                push!(Eᵢs,
                    reduce(hcat,
                        [sum(fA .* (fA .- fAⁱ[k]), dims = 2) for k in 1:d] ./ (n)))
            elseif Ei_estimator === :Jansen1999
                push!(Eᵢs,
                    reduce(hcat, [sum(abs2, fA - fAⁱ[k], dims = 2) for k in 1:d] ./ (2n)))
            elseif Ei_estimator === :Janon2014
                push!(Eᵢs,
                    reduce(hcat,
                        [(sum(fA .^ 2 + fAⁱ[k] .^ 2, dims = 2) ./ (2n) .-
                          (sum(fA + fAⁱ[k], dims = 2) ./ (2n)) .^ 2) .* (1.0 .-
                          (1 / n .* sum(fA .* fAⁱ[k], dims = 2) .-
                           (1 / n * sum((fA .+ fAⁱ[k]) ./ 2, dims = 2)) .^ 2) ./
                          (1 / n .* sum((fA .^ 2 .+ fAⁱ[k] .^ 2) ./ 2, dims = 2) .-
                           (1 / n * sum((fA .+ fAⁱ[k]) ./ 2, dims = 2)) .^ 2)) for k in 1:d]))
            end
        end
    end
    if 2 in method.order
        Sᵢⱼs = similar(Vᵢⱼs)
        for i in 1:nboot
            if !multioutput
                M = zeros(T, d, d)
                for k in 1:d
                    for j in (k + 1):d
                        M[k, j] = Vᵢs[i][k] + Vᵢs[i][j]
                    end
                end
                Sᵢⱼs[i] = (Vᵢⱼs[i] - M) ./ Varys[i]
            else
                M = zeros(T, d, d, length(Eys[1]))
                for l in 1:length(Eys[1])
                    for k in 1:d
                        for j in (k + 1):d
                            M[k, j, l] = Vᵢs[i][l, k] + Vᵢs[i][l, j]
                        end
                    end
                end
                Sᵢⱼs[i] = cat(
                    [(Vᵢⱼs[i][:, :, l] - M[:, :, l]) ./ Varys[i][l]
                     for l in 1:length(Eys[1])]...;
                    dims = 3)
            end
        end
    end

    Sᵢs = [Vᵢs[i] ./ Varys[i] for i in 1:nboot]
    Tᵢs = [Eᵢs[i] ./ Varys[i] for i in 1:nboot]
    if nboot > 1
        size_ = size(Sᵢs[1])
        S1 = [[Sᵢ[i] for Sᵢ in Sᵢs] for i in 1:length(Sᵢs[1])]
        ST = [[Tᵢ[i] for Tᵢ in Tᵢs] for i in 1:length(Tᵢs[1])]

        function calc_ci(x, mean = nothing)
            alpha = (1 - method.conf_level)
            std(x, mean = mean) / sqrt(length(x))
        end
        S1_CI = map(calc_ci, S1)
        ST_CI = map(calc_ci, ST)

        if 2 in method.order
            size__ = size(Sᵢⱼs[1])
            S2_CI = Array{T}(undef, size__)
            Sᵢⱼ = Array{T}(undef, size__)
            b = getindex.(Sᵢⱼs, 1)
            Sᵢⱼ[1] = b̄ = mean(b)
            S2_CI[1] = calc_ci(b, b̄)
            for i in 2:length(Sᵢⱼs[1])
                b .= getindex.(Sᵢⱼs, i)
                Sᵢⱼ[i] = b̄ = mean(b)
                S2_CI[i] = calc_ci(b, b̄)
            end
        end
        Sᵢ = reshape(mean.(S1), size_...)
        Tᵢ = reshape(mean.(ST), size_...)
    else
        Sᵢ = Sᵢs[1]
        Tᵢ = Tᵢs[1]
        if 2 in method.order
            Sᵢⱼ = Sᵢⱼs[1]
        end
    end
    if isnothing(y_size)
        _Sᵢ = Sᵢ
        _Tᵢ = Tᵢ
    else
        f_shape = let y_size = y_size
            x -> [reshape(x[:, i], y_size) for i in 1:size(x, 2)]
        end
        _Sᵢ = f_shape(Sᵢ)
        _Tᵢ = f_shape(Tᵢ)
    end
    return SobolResult(_Sᵢ,
        nboot > 1 ? reshape(S1_CI, size_...) : nothing,
        2 in method.order ? Sᵢⱼ : nothing,
        nboot > 1 && 2 in method.order ? S2_CI : nothing,
        _Tᵢ,
        nboot > 1 ? reshape(ST_CI, size_...) : nothing)
end

And the following is a small test code that generates NaNs and runs gsa.

# Tests for nan handling
# Make a differential equation problem that can be divergent. 

using Revise, Test, OrdinaryDiffEq, GlobalSensitivity, Plots, QuasiMonteCarlo, Statistics

function f(du, u, p, t)
    du .= p[1] * u
    return du
end

# Single output case

u0 = [1.0]
p = [50] # diverges
tspan = (0.0, 20.0)
t = collect(range(0, stop = 10, length = 200))
prob = ODEProblem(f, u0, tspan, p)

f1 = let prob = prob, t = t
    function (p)
        prob1 = remake(prob; p = p)
        sol = solve(prob1, Tsit5(); saveat = t)
        if sol.retcode == ReturnCode.Success
            return mean(Array(sol))
        else
            return NaN
        end
    end
end

samples = 100
lb = [20]
ub = [70]
sampler = SobolSample()
A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)

m = gsa(f1, Sobol(), A, B; dropnan = true)
@test !any(isnan.(m.S1))
@test !any(isnan.(m.ST))

# Multi output case

f1 = let prob = prob, t = t
    function (p)
        prob1 = remake(prob; p = p)
        sol = solve(prob1, Tsit5(); saveat = t)
        if sol.retcode == ReturnCode.Success
            return [mean(Array(sol)), std(Array(sol))]
        else
            return [NaN, NaN]
        end
    end
end

m = gsa(f1, Sobol(), A, B; dropnan = true)

@test !any(isnan.(m.S1))
@test !any(isnan.(m.ST))

duodenum96 avatar Jan 07 '25 20:01 duodenum96

Seems like a reasonable approach.

ChrisRackauckas avatar Jan 22 '25 18:01 ChrisRackauckas

If okay, I can open a PR for this, plus implementing the nan-dropping for other GS estimation methods. Never done that so far but I'm willing to try.

duodenum96 avatar Jan 22 '25 18:01 duodenum96

That would be helpful. I don't think any of the maintainers are actively planning to solve this but agree that this would be a nice thing to have

ChrisRackauckas avatar Jan 22 '25 20:01 ChrisRackauckas

I would very strongly recommend against such an approach. Removing samples/trajectories from Sobol's method will break any theoretical guarantee as these depend on the structure of the Sobol sequence, see e.g. also https://github.com/SciML/GlobalSensitivity.jl/pull/167#discussion_r1620234992 and the links therein.

devmotion avatar Apr 06 '25 15:04 devmotion

That's specifically talking about Sobol QMC though. We could just default to more robust QMC samplers that have a clearer convergence on subsets of the domain like Latin Hypercube. Sobol only makes sense because of its restarting property and we really don't exploit that here.

ChrisRackauckas avatar Apr 06 '25 16:04 ChrisRackauckas

I was planning to start working on this this Tuesday, finally found some time. If it works only on certain sampling schemes, maybe we can put a warning? If theoretically completely unfeasible, then we can close the issue. I don't have any theoretical information about global sensitivity beyond what I remember from Chris's SciML book, so I'm okay with either.

duodenum96 avatar Apr 06 '25 18:04 duodenum96

I would throw a hard-error (or rather just continue returning NaN) for Sobol sequences.

We could just default to more robust QMC samplers that have a clearer convergence on subsets of the domain like Latin Hypercube. Sobol only makes sense because of its restarting property and we really don't exploit that here.

Based on the literature I'm aware of, Sobol sequences perform better and are generally the recommended sampling scheme for variance-based global sensitivity analysis, e.g. https://doi.org/10.1051/meca:2007051, https://doi.org/10.1016/j.mbs.2021.108593, https://doi.org/10.1016/j.cpc.2011.12.015. So IMO it would be strange if GlobalSensitivity would change the default sampling scheme to something else (but you can already provide completely arbitrary samples if you want to).

devmotion avatar Apr 06 '25 18:04 devmotion

I don't like this entire direction at all. The purpose of all of this is to approximate integrals. We should just make an IntegralProblem and solve it. Then things like VEGAS or SUAVE would be adaptive quasi-monte carlo methods, so they would do this in a way that gets the higher order convergence and adaptivity. I think ultimately the approach of giving the points to the user is bound to have issues.

ChrisRackauckas avatar May 03 '25 06:05 ChrisRackauckas