DFTK.jl Possible grid issue with numerical pseudopotentials

Figure_1

using DFTK
using Downloads
using PyPlot

# Here, we will use Perdew-Wang LDA PSP from PseudoDojo

URL_UPF = "https://raw.githubusercontent.com/JuliaMolSim/PseudoLibrary/main/pseudos/pd_nc_sr_lda_standard_04_upf/Li.upf"

# We load the HGH and UPF PSPs using `load_psp`, which determines the
# file format using the file extension.

psp_hgh = load_psp("hgh/lda/li-q3.hgh");
path_upf = Downloads.download(URL_UPF, joinpath(tempdir(), "Li.upf"))
psp_upf = load_psp(path_upf);

Es = range(0,200,length=1000)
qs = sqrt.(2 .* Es)
# toplot(q) = abs(DFTK.eval_psp_projector_fourier(psp_upf, 1, 2, q))
toplot(q) = abs(DFTK.eval_psp_local_fourier(psp_upf, q))
PyPlot.semilogy(Es, toplot.(qs))
# display(Plots.plot(Es, toplot.(qs), yaxis=:log))
xlabel("E")
ylabel("V")

This is consistent with https://github.com/JuliaMolSim/DFTK.jl/pull/741#discussion_r1001872872.

Oct 21 '22 15:10 antoine-levitt

@azadoks

Oct 21 '22 15:10 antoine-levitt

Not entirely sure what's happening here, what do you mean exactly by a "grid issue"?

Oct 21 '22 15:10 azadoks

One thing that comes to mind is the radial grid cutoff of the projectors.

Here's a note on cutoff_radius_index from here:

The projector is assumed to be zero beyond this point. If it is not specified, it is assumed to be equal to the mesh size. NOTE: Not specifying cutoff_radius can introduce a considerable numerical noise in some integrals, especially in the PAW case.

If it's not present in the UPF file, my parser will error because the default value of get_attr is zero (possibly something to fix in the parser for better agreement with what QE does).

Oct 21 '22 15:10 azadoks

On a perfect grid, the potential and the projectors should decay quickly, consistently (they are Fourier transforms of very smooth functions). The fact that they don't can be the sign of numerical errors. This is more pronounced on the projectors, where there are clear periodic oscillations that look very much like Gibbs oscillations

Oct 21 '22 15:10 antoine-levitt

One thing that comes to mind is the radial grid cutoff of the projectors.

Probably not, the projectors appear to decay to numerical zero?

Oct 21 '22 15:10 antoine-levitt

Some speculation:

I'd guess that the pseudo generation code / code operator has a reasonable guess about what the converged energy cutoff is for a given pseudo, and they set the grid to be no finer than necessary (it would make the pseudo more expensive to evaluate).

I don't think this is an issue with our implementation necessarily, but rather a choice made by the creator of the pseudo. At the end of the day, the pseudo converges at 30~40 Ha, and even PseudoDojo's strictest recommended cutoff is 41 Ha.

Oct 21 '22 15:10 azadoks

Probably not, the projectors appear to decay to numerical zero?

Yeah, they should definitely do that. Something else I should consider putting into the parser even though it's probably not necessary.

Oct 21 '22 15:10 azadoks

We could try to implement cubic splines for UPFs that have a linear mesh (like this Li pseudo). I remember you saying that this would probably make things worse though, could you explain why?

Oct 21 '22 17:10 azadoks

We could try to implement cubic splines for UPFs that have a linear mesh (like this Li pseudo). I remember you saying that this would probably make things worse though, could you explain why?

So this is a funny argument if you've never heard it. Consider the problem of integrating periodic functions on [0, 1] instead (it's more convenient mathematically), and assume you're given the values of a smooth and periodic function f on an equidistant grid xi=i/N, i = 0...N-1. The humble quadrature int f ~= 1/N sum_i f(xi) is already super effective. The reason is that it's exact if f(x) is a trigonometric polynomial of degree less than N; therefore, the error in the integral is of the order of the Nth Fourier coefficient of f, which is exponentially small in N if f is super smooth (technically, analytic on a strip in the complex plane). OTOH you could decide to interpolate your points with a spline and integrate the resulting interpolated function. Before I thought that this would only be accurate to third (or fourth) order in 1/N and therefore worse than the simple sum, but in fact I'm pretty sure if you do it properly, taking into account the periodic boundary condition, it would be exactly equivalent to the simple quadrature (the reason being that the resulting quadrature rule is a linear functional on the values f(xi) that is translation invariant, and therefore it has to be 1/N ∑ f(xi)). So it wouldn't hurt, it would just do nothing.

Now if the function is not periodic but decays at infinity (say, a gaussian) and you're integrating on an interval [-L,L] outside which the function is numerically zero, the above argument still works, essentially because you can consider the function as being periodic on [-L,L]. In our case, we're integrating from 0 to L, which is why I insisted on considering it instead as one half of an integration from -L to +L, extending the function by parity (which results in the origin having half weight). This argument only works if the function extended by parity is super smooth (eg it has to behave as r^2 near the origin, not as r). If the function is not smooth, the integration will have Gibbs-like oscillations. Maybe that's the issue here; in this case, maybe an abinit-like correction scheme is appropriate...

Good point on 40Ha being considered too big anyway. Still it's strange and a bit unsatisfactory mathematically... Let's do orders of magnitude. Typically, I would expect V(q) (or β(q)) to be of order C exp(-c q), with c of order 1. The error I would expect to be of order of the Fourier coefficient of V(r) exp(iqr) at k=2π/Δr, where Δr is the mesh size, so something like C exp(-c|q-2π/Δr|). The point at which you start to notice grid errors is about when q=π/Δr (not a particularly surprising result in hindsight...). If Δr = 0.01 as is the case here, that's qmax = 314, so E = qmax^2/2 = 50k. Even assuming the constants are not 1, there should still be some room to get extremely accurate results at Δr=0.01! But maybe nobody really cares about this regime of accuracy, and the functions are only accurate to some finite order, say Δr^3. Then, exp(-q) is of order Δr^3 much sooner, at q=13. Which matches pretty well what we see...

Oct 21 '22 19:10 antoine-levitt

This argument only works if the function extended by parity is super smooth (eg it has to behave as r^2 near the origin, not as r). If the function is not smooth, the integration will have Gibbs-like oscillations. Maybe that's the issue here; in this case, maybe an abinit-like correction scheme is appropriate...

In fact, rβ is not really smooth near 0; the value at r=0 is always a bit funky, and that could be messing us up.

Maybe this could be patched up with an appropriate extrapolation to r=0? There might be a good reason that it's like this though.

But maybe nobody really cares about this regime of accuracy, and the functions are only accurate to some finite order, say Δr^3. Then, exp(-q) is of order Δr^3 much sooner, at q=13. Which matches pretty well what we see...

Could this have something to do with this comment in the Abinit implementation?

!Interpolate projectors onto new grid if called for
!Cubic polynomial interpolation is used which is consistent
!with the original interpolation of these functions from
!a log grid to the input linear grid.

It seems like ONCVPSP uses a log-grid internally and interpolates to a linear grid for writing the PSP files using cubic splines. This would give finite accuracy to Δr^3, right?

Oct 22 '22 08:10 azadoks

In fact, rβ is not really smooth near 0; the value at r=0 is always a bit funky, and that could be messing us up.

That plot is skewed by the log scale and the grid finiteness, but is actually completely fine when you look at it in lin space. The integrand (r^2 beta(r)) does appear to be a nice even function. Two potential issues:

It goes to zero a bit funny, but it's not that bad
When l=1, the spherical bessel is odd, so the integrand looks like r^3 at the origin But https://github.com/JuliaMolSim/DFTK.jl/pull/741#discussion_r1001908825 shows that it goes to zero pretty fast in Fourier space, so I'm not overly concerned about it

I'm more concerned about the potential, for which the integrand sin(qr) r^2 vloc(r) does not look like a nice even function at all, and goes as r at the origin (this is not the fault of the correction term). Do you happen to have abinit's improved trapezoidal rule handy to see how the plot in the first post of this issue looks like?

Oct 24 '22 09:10 antoine-levitt

The integrand for for the potential is really j_0(qr) r^2 (vloc(r) + Z/r) = sin(qr)/qr r^2 (vloc(r) + Z/r).

begin
    using DFTK
    using Downloads
    using Plots

    URL_BASE = "https://raw.githubusercontent.com/JuliaMolSim/PseudoLibrary/main/pseudos/"
end

function ctrap_abinit(ff::AbstractVector{T}, hh::T)::T where {T <: Real}
    imax = length(ff)

    if imax >= 10
        endpt = (
            T(23.75) * (ff[1] + ff[imax    ]) + 
            T(95.10) * (ff[2] + ff[imax - 1]) +
            T(55.20) * (ff[3] + ff[imax - 2]) +
            T(79.30) * (ff[4] + ff[imax - 3]) +
            T(70.65) * (ff[5] + ff[imax - 4])
        ) / T(72)
        ir2 = imax - 5
        s = zero(T)
        if ir2 > 5
            for ir in 6:ir2
                s += ff[ir]
            end
        end
        ans = (s + endpt) * hh
    elseif nf >= 8
        endpt = (
            17(ff[1] + f[imax    ]) +
            59(ff[2] + f[imax - 1]) +
            43(ff[3] + f[imax - 2]) +
            49(ff[4] + f[imax - 3])
        ) / T(48)
        s = zero(T)
        if imax == 9
            s = ff[5]
        end
        ans = (sum + endpt) * hh
    elseif nf == 7
        ans = (17(ff[1] + ff[7]) + 59(ff[2] + ff[6]) + 43(ff[3] + ff[5]) + 50(ff[4])) / T(48) * hh
    elseif nf == 6
        ans = (17(ff[1] + ff[6]) + 59(ff[2] + ff[5]) + 44(ff[3] + ff[4])            ) / T(48) * hh
    elseif nf == 5
        ans = (  (ff[1] + ff[5]) +  4(ff[2] + ff[4]) +  2(ff[3]       )             ) /  T(3) * hh
    elseif nf == 4
        ans = ( 3(ff[1] + ff[4]) +  9(ff[2] + ff[3])                                ) /  T(8) * hh
    elseif nf == 3
        ans = (  (ff[1] + ff[3]) +  4(ff[2]       )                                 ) /  T(3) * hh
    elseif nf == 2
        ans = (  (ff[1] + ff[2])                                                    ) /  T(2) * hh
    elseif nf == 1
        ans = (  (ff[1]       )                                                     )         * hh
    end
    return ans
end

function simpson_qe(func::AbstractVector{T}, rab::Vector{T})::T where {T <: Real}
    mesh = length(func)
    asum = zero(T)
    r12 = T(1/3)
    for i in 2:mesh-1
        fct = T(abs(i % 2 - 2) * 2)
        asum = asum + fct * func[i] * rab[i]
    end
    if mesh % 2 == 1
        asum = (asum + func[1] * rab[1] + func[mesh] * rab[mesh]) * r12
    else
        asum = (asum + func[1] * rab[1] - func[mesh-1] * rab[mesh-1]) * r12
    end
    return asum
end

function simpson_qe(func::AbstractVector{T}, rab::T)::T where {T <: Real}
    simpson_qe(func, zeros(T, length(func)) .+ rab)
end

function ctrap_vloc_q(psp::PspUpf{T}, q::T)::T where {T <: Real}
    integrand = sin.(q .* psp.rgrid) .* (psp.rgrid .* psp.vloc .+ psp.Zion)
    s = ctrap_abinit(integrand, psp.drgrid[1])
    4T(π) * (s - psp.Zion / q) / q
end

function simpson_vloc_q(psp::PspUpf{T}, q::T)::T where {T <: Real}
    integrand = sin.(q .* psp.rgrid) .* (psp.rgrid .* psp.vloc .+ psp.Zion)
    s = simpson_qe(integrand, psp.drgrid[1])
    4T(π) * (s - psp.Zion / q) / q
end

psp = let
    url = joinpath(URL_BASE, "pd_nc_sr_lda_standard_04_upf/Li.upf")
    path = Downloads.download(url, joinpath(tempdir(), "Li.upf"))
    load_psp(path)
end;

let
    Egrid = range(0, 400, length=1000)
    qgrid = sqrt.(2 .* Egrid)
    vloc_ctrap = abs.(ctrap_vloc_q.(psp, qgrid))
    vloc_simpson = abs.(simpson_vloc_q.(psp, qgrid))
    vloc_dftk = abs.(DFTK.eval_psp_local_fourier.(psp, qgrid))
    plot(Egrid, [vloc_ctrap, vloc_simpson, vloc_dftk],
         labels=["ctrap (ABINIT)" "simpson (QuantumEspresso)" "dot (DFTK)"],
         xlabel="E [Hartree]", ylabel="V(E)", yscale=:log10,
         linewidth=2)
    xticks!(0:20:400)
    yticks!([1e2, 1e1, 1e0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10])
end

Oct 24 '22 11:10 azadoks

I couldn't resist. Above: local potential using QE's Simpson's method and Abinit's corrected trapezoidal method.

Oct 24 '22 12:10 azadoks

Very nice! By comparison, on the projectors there's no difference, as should be. OK so it's quite clear what's going on now: the integrand is linear near zero, and therefore the standard method is inefficient. The solution is to either correct specifically for this via a Euler-MacLaurin formula (I believe that's what abinit does) or higher-order quadrature (what QE does). We should just implement abinit's corrected method (the simple version : only correcting at the origin, and assuming n>=10). If you feel like it feel free to do it, otherwise I'll do it sometime.

Oct 24 '22 12:10 antoine-levitt

See https://en.wikipedia.org/wiki/Euler%E2%80%93Maclaurin_formula#Approximation_of_integrals, where the h^4/720 term looks similar to abinit's.

Oct 24 '22 13:10 antoine-levitt

Ok cool, yeah probably not until the end of the week at the earliest. One important detail is that the UPFs use one of three kinds of grid: linear, log (type 1), log (type 2) (see here).

I think that ONCVPSP uses the linear grid (just like the PSP8 format), while pseudos from ld1.x and things converted to UPF (like the HGH pseudos) are probably on one of the two log grids.

Abinit / Euler-MacLaurin would only work out-of-the-box for the linear grid. On the other hand, QE's higher-order quadrature works for all three.

Oct 24 '22 13:10 azadoks

Ah, yes, I don't know if Euler-MacLaurin formulas can be derived for nonuniform grids...

Oct 24 '22 14:10 antoine-levitt

So then better maybe to use QE's Simpson's method? Otherwise, we could implement both and set up an interface for asking the pseudo what kind of grid its on. This could save us a bit of time for pseudos on uniform grids.

Oct 25 '22 14:10 azadoks

Might as well go with Simpson always... One neat way of writing it is by just changing the integration weights, ie use an array weights, which is computed from rgrid, where you currently use drgrid. That way performance is the same.

Oct 25 '22 14:10 antoine-levitt

Ah true, that could be pretty good. In the end, I just implemented a cleaner simpson function that seems to have fair performance (a bit better than QE's implementation above in small tests). I'll have to do some benchmarks and profiling. It might be a bit more readable to call simpson rather than precompute some cryptic weights.

Precomputing the weights and using them everywhere like you suggest would be good too, but the projectors are cutoff at some arbitrary point. I'm not sure if it's always even on an even index, so things could get a bit messy

I'll do some profiling tomorrow and see if the function is good enough to use everywhere.

Oct 25 '22 15:10 azadoks

Where "Simpson Julia" is what is used for the convergence (Ecut 140 Ha is the reference):

Oct 25 '22 16:10 azadoks

Do you have the new and old version of the Ecut convergence that you can put on the same plot easily? Looks like it's a bit better

Oct 25 '22 17:10 antoine-levitt

I haven’t followed the discussion here and not sure it's related, but there was a (unresolved) discussion about integration routines in QE: https://gitlab.com/QEF/q-e/-/issues/310

There are multiple routines for integration from 0 to inf in QE. One is Simpson, and the other (simpson_cp90) is open simpson where the correction applies only to the boundary points: https://gitlab.com/QEF/q-e/-/blob/develop/upflib/simpsn.f90#L61

function simpson_qe_cp90(func::AbstractVector{T}, rab::Vector{T})::T where {T <: Real}
    mesh = length(func)
    asum = zero(T)
    r12 = T(1/3)
    for i in 5:mesh-4
        asum += func[i] * rab[i]
    end
    asum += (func[1] * rab[1] + func[mesh-0] * rab[mesh-0]) * 109 / 48
    asum += (func[2] * rab[2] + func[mesh-1] * rab[mesh-1]) * -5 / 48
    asum += (func[3] * rab[3] + func[mesh-2] * rab[mesh-2]) * 63 / 48
    asum += (func[4] * rab[4] + func[mesh-3] * rab[mesh-3]) * 49 / 48
    return asum
end

function simpson_qe_cp90(func::AbstractVector{T}, rab::T)::T where {T <: Real}
    simpson_qe_cp90(func, zeros(T, length(func)) .+ rab)
end

function simpson_cp90_vloc_q(psp::PspUpf{T}, q::T)::T where {T <: Real}
    integrand = sin.(q .* psp.rgrid) .* (psp.rgrid .* psp.vloc .+ psp.Zion)
    # NOTE: for simpson_qe_cp90, one should skip r=0.
    s = simpson_qe_cp90(integrand[2:end], psp.drgrid[1])
    4T(π) * (s - psp.Zion / q) / q
end

Oct 26 '22 01:10 jaemolihm

Do you have the new and old version of the Ecut convergence that you can put on the same plot easily? Looks like it's a bit better

I don't, but I can cook it up.

I haven’t followed the discussion here and not sure it's related, but there was a (unresolved) discussion about integration routines in QE: https://gitlab.com/QEF/q-e/-/issues/310

Yes, this does seem relevant, thanks!

I saw that there was a "CP" version of Simpson's method when I was digging around, but I never really gave it a look.

# NOTE: for simpson_qe_cp90, one should skip r=0.

Aha, should be careful with this because the grid doesn't always start at r=0.

Here's a quick comparison between the pw.x and cp.x integrators:

Oct 26 '22 08:10 azadoks

Precomputing the weights and using them everywhere like you suggest would be good too, but the projectors are cutoff at some arbitrary point. I'm not sure if it's always even on an even index, so things could get a bit messy

This is likely fine as the projector is supposed to be zero after the cutoff, and be smooth at that point

It might be a bit more readable to call simpson rather than precompute some cryptic weights.

I guess it's a matter of perspective, but doing dot(simpson_weights(r), f.(r)) with simpson_weights building the weights is not that complicated, as it's relatively common to see quadrature rules as being weighted sums. The nice thing about it is that if you don't want to precompute f.(r) for some reason but compute them on the fly, you can manually do s += f(r) * weights[ir]. But by all means if you prefer calling a function directly go for it, I don't have a strong opinion.

There are multiple routines for integration from 0 to inf in QE. One is Simpson, and the other (simpson_cp90) is open simpson where the correction applies only to the boundary points: https://gitlab.com/QEF/q-e/-/blob/develop/upflib/simpsn.f90#L61

Thanks! The 48 denominator makes me think it's using https://en.wikipedia.org/wiki/Euler%E2%80%93Maclaurin_formula#Approximation_of_integrals at one order less than abinit.

By curiosity, I've tried to look up that formula in the given source (numerical recipes) and couldn't in my copy of numerical recipes 3rd edition. Then I went down to the library and tracked down the first edition.

Here is the first edition 20221026_100916 (1)

And the third

20221026_101132 (1)

So I wouldn't necessarily trust that formula very far... I don't have a strong opinion on which is better between the simpson method with alternated 1/3 - 2/3 coefficients and the corrected trapezoidal rule (as long as, you know, it's correct :-p)

Oct 26 '22 08:10 antoine-levitt

Wait, no, actually both versions are not necessarily contradictory... gah this is confusing. Euler-MacLaurin looks simpler!

Oct 26 '22 08:10 antoine-levitt

Here's a plot using the dot / trapezoid method, it looks very similar. I'll double-check that this is correct, but I think it's right.

Oct 26 '22 13:10 azadoks

DFTK.jl DFTK.jl copied to clipboard

Possible grid issue with numerical pseudopotentials

DFTK.jl
DFTK.jl copied to clipboard