
Figure out an API for SCF / ground state algorithms

Open. LaurentVidal95 opened this issue 2 years ago • 9 comments

Dear DFTK people,

We looked at the src/scf dir with @epolack and we think that this part of the code could be rearranged to allow more flexibility and to be a bit clearer. Some methods are scattered across different files. At the moment the code doesn't quite match the formulation of the SCF from @mfherbst's talk at the DFTK summer school, which I find super clear:

\rho_{n+1} = \rho_n + \alpha P^{-1}(D(V(\rho_n)) - \rho_n)

As I understand it the steps are:

  1. Compute the accelerated density with DIIS/Anderson/another acceleration technique (currently in potential_mixing.jl and scf_solvers.jl):
\tilde{\rho}_n = {\rm DIIS}(\rho_n, \cdots, \rho_{n-{\rm depth}})
  2. Apply the fixed-point map (currently in scf_solvers.jl):
D(V(\tilde{\rho}_n))
  3. Apply the preconditioner (currently in mixing.jl):
P^{-1}(D(V(\tilde{\rho}_n)) - \rho_n)
  4. Apply the damping (currently in scf_solvers.jl):
\rho_{n+1} = \rho_n + \alpha P^{-1}(D(V(\tilde{\rho}_n)) - \rho_n)

These four steps are done in the fixpoint map of self_consistent_field.jl.
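In code, one iteration of this scheme could read roughly as follows (a sketch only; accelerate, scf_map and precondition are placeholder names, not DFTK's current API):

```julia
# One SCF step as described above (all helper names are placeholders).
function scf_step(ρ_history, α)
    ρ_acc = accelerate(ρ_history)               # step 1: DIIS/Anderson on the density history
    Fρ    = scf_map(ρ_acc)                      # step 2: fixed-point map D(V(ρ̃ₙ))
    δρ    = precondition(Fρ - ρ_history[end])   # step 3: P⁻¹(D(V(ρ̃ₙ)) - ρₙ)
    return ρ_history[end] + α * δρ              # step 4: damping
end
```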

We would suggest having four keyword arguments in the self_consistent_field routine, one for each of these four steps (see the sketch after the list below). This would avoid having to rewrite a whole solver just to modify the acceleration part, or any other of the four steps. The organization of the scf dir would be something like:

scf/
* self_consistent_field.jl (main)
* acceleration.jl (step 1)
* scf_solver.jl (step 2)
* preconditioning.jl (or mixing.jl) (step 3)
* damping.jl (or post_process.jl to account for level-shifting stuff?) (step 4)
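Concretely, the signature could look something like this (hypothetical defaults; only KerkerMixing and AndersonAcceleration exist in DFTK today, the other names are made up for illustration):

```julia
# Hypothetical signature: one keyword per step, each an exchangeable struct/callable.
function self_consistent_field(basis;
                               acceleration = AndersonAcceleration(),  # step 1
                               solver       = default_scf_solver(),    # step 2 (made-up name)
                               mixing       = KerkerMixing(),          # step 3
                               damping      = FixedDamping(0.8))       # step 4 (made-up name)
    # ... fixpoint loop composing the four steps ...
end
```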

We could discuss that at the next DFTK meeting! What do you think about that idea?

Best to all, Laurent

PS: The direct minimization might even be integrated with the SCF routines in a general compute_ground_state routine or something, but that is another question.

LaurentVidal95, Nov 23 '22

In #33, @antoine-levitt suggested:

We really need to do this at some point. Something like

```julia
solver = Solver(basis, DensityMixing(; mixing, eigensolver); occupation_scheme, rho_init, psi_init)
scfres = solve(solver)
```

? That seems to be the pattern used in the SciML solvers.

mfherbst, Nov 23 '22

I think we should do something along these lines. In SciML they usually have the first arg define the problem and the remaining args the algorithms. So that would indeed suggest something like

```julia
solve(basis, DensityMixing(; mixing, eigensolver); occupation_scheme, rho_init, psi_init)
```

Maybe wrapping basis in a DftGroundStateProblem, but I'm not so sure about that. We could also avoid calling it solve and instead use something like ground_state, similar to what has been suggested above. But the idea of using more structs for the algorithmic parts is very good.
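A minimal sketch of what such a struct-based interface could look like (all type names here are hypothetical, none of this is existing DFTK API):

```julia
# Hypothetical sketch of a SciML-style interface: the problem and the
# algorithm are separate structs, options travel as keyword arguments.
struct DensityMixing{M, E}
    mixing::M          # preconditioner, e.g. KerkerMixing()
    eigensolver::E     # inner eigensolver for the diagonalization
end

struct DftGroundStateProblem{B}
    basis::B           # the discretized problem to solve
end

function solve(prob::DftGroundStateProblem, alg::DensityMixing;
               rho_init=nothing, psi_init=nothing, tol=1e-6)
    # ... SCF loop driven by alg.mixing and alg.eigensolver ...
end
```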

mfherbst, Nov 23 '22

Also, is there a specific reason for the NLsolve method to be the default one? (I won't be able to attend tomorrow to discuss this, but if you agree on the broad idea, I could start working on this next week with Laurent.)

epolack, Nov 23 '22

Regarding the steps you mention @LaurentVidal95 ... it works out nicely for density mixing, but if you want to support both density and potential mixing it becomes a bit messier. Our current solution, however, has a lot of duplicated code and is probably not the best, I agree. Part of the reason is NLsolve, which works for density mixing but not for potential mixing.

So that suggests removing NLsolve (which, I agree @epolack, can be done). However, then we should make sure that the sort of tricks I use in my workshops (you get an F(ρ) = D(V(ρ)) and can do with it what you want) still work.
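The workshop trick amounts to getting hold of the bare SCF map and iterating it by hand; roughly like this (fixpoint_map is a hypothetical accessor, guess_density exists in DFTK):

```julia
# Hypothetical: expose F(ρ) = D(V(ρ)) so users can drive the iteration
# themselves, here with a plain damped fixed-point iteration.
F = fixpoint_map(basis)        # hypothetical accessor, not current API
ρ = guess_density(basis)
α = 0.8
for n in 1:20
    ρ = ρ + α * (F(ρ) - ρ)     # simple damped iteration, no preconditioner
end
```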

mfherbst, Nov 23 '22

The reason why NLsolve is still the default, by the way (beyond historical reasons), is that it has a slightly more performant Anderson that @antoine-levitt wrote. In particular it uses fewer copies and is less wasteful in memory. If we ported that over to DFTK's Anderson, then NLsolve would bring no benefit over our own implementation.

I would suggest that before we even start coding something we brainstorm an interface and try to see how to best refactor. But to discuss this we will most likely need more time than the weekly meeting.

Also, I think a lot of this is pretty subtle: we need to ensure by solid testing that we don't accidentally break something or regress performance, as there are a bunch of places where heuristics come into play.

mfherbst avatar Nov 23 '22 13:11 mfherbst

So that suggests removing NLsolve (which, I agree @epolack, can be done).

NLsolve is also one of the main contributors to the super high latency (time to first SCF) we have, and is not actively maintained, so getting rid of it as a default option would be good, ideally. Then we can start precompiling more stuff and hopefully get a reasonable latency. @pkofod has been promising a new and improved NLSolvers for ages now, but I'm not sure what the status is.

The reason why NLsolve is still the default, by the way (beyond historical reasons), is that it has a slightly more performant Anderson that @antoine-levitt wrote. In particular it uses fewer copies and is less wasteful in memory.

The main difference is the use of QR update/downdate (i.e. adding/removing columns of a QR factorization without recomputing it from scratch). This is implemented in NLsolve in https://github.com/JuliaNLSolvers/NLsolve.jl/blob/master/src/nlsolve/utils.jl. To be honest I'm not sure it's noticeable, so we can probably just set the current algorithm as default without too much of a performance penalty.
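For reference, the update half of that trick looks roughly like this (a self-contained sketch, not NLsolve's actual code; the downdate, i.e. dropping the oldest column, additionally needs Givens rotations to restore the triangular shape):

```julia
using LinearAlgebra

# Append a column `a` to a thin QR factorization Q (m×n, orthonormal
# columns), R (n×n, upper triangular) without refactorizing from
# scratch: one Gram–Schmidt step against the existing basis suffices.
function qr_append(Q::AbstractMatrix, R::AbstractMatrix, a::AbstractVector)
    c = Q' * a                            # coefficients of a in span(Q)
    r = a - Q * c                         # part of a orthogonal to span(Q)
    ρ = norm(r)
    Qnew = hcat(Q, r / ρ)                 # m×(n+1)
    Rnew = [R c; zeros(1, size(R, 2)) ρ]  # (n+1)×(n+1), still triangular
    return Qnew, Rnew
end

A = randn(8, 3)
F = qr(A)
Q, R = Matrix(F.Q), F.R                   # thin factors
a = randn(8)
Qnew, Rnew = qr_append(Q, R, a)
@assert Qnew * Rnew ≈ hcat(A, a)          # same result as refactorizing [A a]
```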

We looked at the src/scf dir with @epolack and we think that this part of the code could be rearranged to allow more flexibility and to be a bit clearer. Some methods are scattered across different files.

Yeah, moving stuff between files could be a good idea. E.g. AndersonAcceleration is in potential_mixing.jl but should be in scf_solvers.jl?

The main reason I'm not super enthusiastic about all of this is that it's a lot of not-so-fascinating work that will only marginally improve clarity and will not actually get us new features. I'm especially not sure about adding new levels of indirection without improving code reuse. But getting rid of NLsolve should simplify things, at least.

antoine-levitt, Nov 23 '22

Nice, thanks for the answers! I agree that the goal is not to spend a lot of time on this that could be devoted to more interesting work. But on the other hand, if a quick reorganization can help to implement new stuff more easily without hurting performance, that could be worthwhile! If we have a meeting tomorrow at 13h30 I'll be there to discuss it :+1:

LaurentVidal95, Nov 23 '22

The main reason I'm not super enthusiastic about all of this is that it's a lot of not-so-fascinating work that will only marginally improve clarity and will not actually get us new features.

I planned to work a bit on acceleration methods (using descriptors for good initial guesses), and this question came up quite naturally. But that can be done without any structural change to the code.

epolack, Nov 23 '22

The main difference is the use of QR update/downdate (i.e. adding/removing columns of a QR factorization without recomputing it from scratch).

Yes, that, and that we are really wasteful with memory (building the matrix M to invert completely from scratch each time by splatting into hcat), which I think is the real issue.
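For illustration, the wasteful pattern versus a preallocated ring buffer might look like this (a generic sketch, not DFTK's actual Anderson code):

```julia
m, maxdepth = 1000, 10                # hypothetical problem size / history depth
history = [randn(m) for _ in 1:maxdepth]

# Wasteful pattern: splatting into hcat allocates the full m×depth
# matrix from scratch at every SCF step.
M = hcat(history...)

# Preallocated alternative: one fixed buffer, written in place.
M_buf = zeros(m, maxdepth)            # allocated once for the whole SCF
function push_history!(M_buf, x, n)
    maxdepth = size(M_buf, 2)
    M_buf[:, mod1(n, maxdepth)] .= x              # overwrite the oldest column
    return view(M_buf, :, 1:min(n, maxdepth))     # active columns, no copy
end
# (Once the buffer wraps, the chronological column order has to be
# tracked separately when setting up the least-squares problem.)
```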

mfherbst, Nov 23 '22