DFTK.jl
Enhance user experience with MPI
This PR aims to enhance the user experience with MPI parallelization by no longer aborting when n_ranks > n_kpt.
The current solution of stopping execution with an error message is not optimal. Indeed, for an arbitrary system, it is not trivial to know the number of irreducible k-points in advance. This is particularly annoying when calculations are launched in an automated fashion, where the only safe bet is to not use MPI at all.
The solution proposed here is simple. When a calculation is started with more MPI ranks than k-points, a new MPI communicator is created. This communicator has as many ranks as there are k-points, while the remaining processes exit the program.
While this may lead to idle CPU time, I believe that not crashing improves the experience. Moreover, a warning describing the situation is printed, so that the user can optimize their next run. I would also argue that this is not worse than under-parallelizing in instances where the number of k-points is not a multiple of the number of MPI ranks (maybe we should also issue a warning in such a case?).
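For illustration, here is a minimal sketch of the sub-communicator idea using MPI.jl; the function and variable names are made up for this example and do not correspond to DFTK's actual code:

```julia
using MPI

# Illustrative sketch only (not DFTK code): keep the first n_kpt ranks of `comm`
# in a sub-communicator; the remaining ranks end up in a second, surplus group.
function restrict_to_kpoints(comm::MPI.Comm, n_kpt::Integer)
    rank  = MPI.Comm_rank(comm)
    color = rank < n_kpt ? 0 : 1          # 0 = participates, 1 = surplus rank
    subcomm = MPI.Comm_split(comm, color, rank)
    if color == 1
        @warn "More MPI ranks than k-points; rank $rank has no k-point to work on."
        # In the proposal above these surplus ranks would exit() here; see the
        # discussion below for why that turns out to be problematic.
    end
    return subcomm, color == 0
end
```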
I also took the opportunity to fix testing with the :mpi tag. The current check on parse(Bool, get(ENV, "CI", "false")) in the PlaneWaveBasis creation, which essentially disables MPI testing on CI, has been removed. Instead, all tests involving kgrids with a single k-point have received the :dont_test_mpi tag. The number of MPI ranks for MPI testing is also hardcoded to 2: because all tests are run as a single calculation, killing processes when n_ranks > n_kpt would make the tests hang. This also fixes local testing via Pkg.test("DFTK"; test_args = ["mpi"]).
BTW the test failures in the "normal" tests seem fake (because of a too tightly chosen test tolerance, I think), but the cancelled MPI test points to an issue in your implementation (could be related to the exit() btw.)
Calling exit from within PlaneWaveBasis is super unexpected and can easily lead to spurious bugs. I'm thinking of a case where one uses MPI both for DFTK and to run other things, and all of a sudden part of the processes in the global communicator are just gone, so the program hangs or does undefined stuff.
Yes, I agree actually. In fact, I was facing this exact problem when trying to run the tests with more than 2 MPI ranks.
Ideally, for robust behavior, the extra MPI ranks should not exit the program, but wait at a synchronization barrier at the end of execution. I am not quite sure how to implement that in a lightweight way, without an arbitrarily large if block after the creation of the sub-communicator. I will give it some thought.
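Purely as an illustration of that idea (made-up names, not DFTK code), the surplus ranks could block on a barrier over the world communicator instead of calling exit():

```julia
using MPI

# Sketch only: surplus ranks skip the actual work and wait at a barrier, so
# MPI.COMM_WORLD stays intact for whatever else the user's program does afterwards.
function run_with_spare_ranks(work, participates::Bool)
    participates && work()              # only ranks that own k-points compute
    MPI.Barrier(MPI.COMM_WORLD)         # all ranks meet here at the end of the run
end
```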
I think there are two important questions to be addressed regarding the user-facing interface:
- Should an input script look any different when a parallel run is intended?
- Should the user have to think about creating a sub-communicator or setting up parallelization when writing their input?
I would tend to say no to both of the above. I'll try to come up with something along these lines.
The latest commit brings major simplifications to this MPI issue. Instead of building sub-communicators and killing processes in a dangerous fashion, I propose to duplicate some k-points over the empty MPI ranks. The weights are adjusted so that the results remain exact. A warning is issued as well.
This is the lightest fix I could think of. It is also safe and robust, and the user does not have to think about parallelism in their script. The only potential issue I see: if the user queries the k-point mesh after the calculation, they could get a non-irreducible k-mesh with duplicates. One possible solution would be to add an extra field to the basis holding the original mesh, or to write a function that always returns the irreducible mesh.
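As an illustration of the weight bookkeeping behind this (a sketch with made-up helper names, not the actual DFTK implementation; it assumes at most one duplicate per k-point, i.e. n_ranks <= 2 * n_kpt):

```julia
# Sketch only: pad the k-point list up to n_ranks by repeating the first k-points
# and split each duplicated weight in half, so that any weighted BZ sum of the form
# sum(weights .* f.(coords)) stays exactly the same.
function pad_kpoints(kcoords::Vector, kweights::Vector{<:Real}, n_ranks::Integer)
    n_kpt = length(kcoords)
    n_kpt >= n_ranks && return kcoords, kweights
    n_extra = n_ranks - n_kpt
    @assert n_extra <= n_kpt  # keep the sketch simple: at most one duplicate each
    coords  = vcat(kcoords,  kcoords[1:n_extra])
    weights = vcat(kweights, kweights[1:n_extra] ./ 2)
    weights[1:n_extra] ./= 2  # the original copies also keep only half their weight
    return coords, weights
end
```

A duplicated k-point then contributes w/2 from each of its two copies, which is why the results stay exact.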
Regarding the tests: I have had in mind for a long time to actually switch the logic. Instead of "disabling" MPI tests where needed, I think it's better to annotate those which do support MPI instead (i.e. instead of a :dont_test_mpi, have a simple :mpi on the complementary set). If it's too much work, leave it as is and we do a follow-up ...
I don't mind changing from :dont_test_mpi to :mpi tags in the tests. However, if this PR is accepted, I believe all tests could actually be run in parallel (to be checked by me). In such a case, we could get rid of this tag completely.
BTW the test failures in the "normal" tests seem fake (because of a too tightly chosen test tolerance, I think), but the cancelled MPI test points to an issue in your implementation (could be related to the exit() btw.)
In my experience, it seems that the anderson.jl test is particularly unstable.
I'm in favour of this. @abussy Please update and then we can merge this!
@antoine-levitt Objections ?
@abussy The only thing that we have to be careful with is unfolding the k-point mesh when not using symmetries and during IO operations.
I haven't looked in detail, but it looks reasonable, go ahead!
Notes:
- Added a n_irreducible_kpoints field to the PlaneWaveBasis --> used in I/O so that only irreducible k-points are considered
- Added irreducible_kcoords() and irreducible_kweights() functions for the same reasons (a rough sketch follows below)
- BZ unfolding is currently not implemented with MPI, so duplicated k-points won't affect it. Added an extra error message in case MPI support gets implemented
- Rebased for compatibility with the master branch
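A rough sketch of what those accessors could look like, assuming the n_irreducible_kpoints field above together with global kcoords_global/kweights_global arrays on the basis (the actual field names and implementation in the PR may differ):

```julia
# Sketch only (assumed field names, not DFTK's actual implementation): the first
# n_irreducible_kpoints entries are the original k-points; everything after that
# is a duplicate introduced for MPI load balancing.
irreducible_kcoords(basis) = basis.kcoords_global[1:basis.n_irreducible_kpoints]

function irreducible_kweights(basis)
    n_irr   = basis.n_irreducible_kpoints
    weights = basis.kweights_global[1:n_irr]  # slicing copies, so we can mutate
    # Fold the weight of every duplicated k-point back onto its original copy,
    # recovering the weights of the undistributed irreducible mesh.
    for ik in n_irr+1:length(basis.kcoords_global)
        iorig = findfirst(isequal(basis.kcoords_global[ik]),
                          view(basis.kcoords_global, 1:n_irr))
        weights[iorig] += basis.kweights_global[ik]
    end
    return weights
end
```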
Addressed most of the concerns raised by @mfherbst during his review. I'd like to raise two points for discussion:
- I gave some thought to changing the :dont_test_mpi test tag to :mpi. I think that, by default, DFTK should be MPI compatible. Any method or function not implemented for MPI is then an exception, and using the :dont_test_mpi test flag makes that clear. I am afraid that changing to :mpi instead would reduce the impact of the tag.
- When reworking the irreducible_kweights_global() function in PlaneWaveBasis.jl, I realised that in some cases duplicated k-points already exist (even in non-parallel runs), and I refrained from using functions such as findall(). In particular, when calculating band structures with the compute_bands() function, the k-point path might go through special points multiple times, on purpose. As a result, I kept the logic of explicitly looking for replicated k-points in basis.kcoords_global[basis.n_irreducible_kpoints+1:end]. I might change the nomenclature though, as the original set of k-points is not necessarily the irreducible one.
I gave some thought to changing the :dont_test_mpi test tag to :mpi. I think that, by default, DFTK should be MPI compatible. Any method or function not implemented for MPI is then an exception, and using the :dont_test_mpi test flag makes that clear. I am afraid that changing to :mpi instead would reduce the impact of the tag.
Hmm. Yes, but I think that's ok, because it's fine if only the core is tested with MPI and not the rest. But let's think about this some more and leave as is for now.
... I realised that in some cases duplicated k-points already exist (even in non-parallel runs) ... In particular, when calculating band structures with the compute_bands() function ...
True, good point.
~But still there is a problem. The coordinates can be the same if the spin is different. I'm not sure you handle this properly right now?~
Update: Ah no, this is before the entire spin business.