cp2k
Try out ELPA's new redistribution feature
During yesterday's ELPA meeting it was suggested that we try its new redistribution feature. This could have several advantages:
- we could remove our own redistribution code
- hopefully resolve #1444 as it's presumably caused by bugs in our redistribution
- always arrive at a block size that is GPU compatible
- overall better performance because ELPA picks more optimal distributions
Unfortunately, I could not find any documentation on this. @marekandreas, maybe you could help?
Hi @marekandreas, are there any updates on the new redistribution feature?
I just found a possible ELPA bug that leads to RPA hanging for certain rank counts. I submitted an issue here:
https://github.com/marekandreas/elpa/issues/17
Basically, the problem arises when the communicator size changes between ELPA calls. For now I don't know whether the culprit is ELPA or CP2K, but since "redistribution" is in the title of this issue, it seems related.
Pinging a few people whom I've seen mention ELPA hangs in various topics: @oschuett @alazzaro @dev-zero @mkrack
This could be the same problem. Fortunately, it doesn't look hard to fix.
In particular, @alazzaro, I see you recently switched the 32-H2O input file to use ScaLAPACK instead of ELPA, but with a quick fix in ELPA it runs for me with correct accuracy, so this might not be needed.
This issue was actually meant more as a feature request for replacing our own redistribution code with an upcoming ELPA feature. However, AFAIK ELPA has not yet released that feature.
The reason we redistribute our matrices is that ELPA requires each rank to hold at least one block. When there are fewer blocks than ranks, we even have to shrink the communicator.
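To make the constraint concrete, here is a minimal Python sketch, assuming a square matrix in a standard 2D block-cyclic layout. It is not CP2K's or ELPA's actual code; the function names and the example sizes are hypothetical and only illustrate when a rank ends up without a block and how far the process grid would have to shrink.

```python
def blocks_per_dim(n, nb):
    """Number of blocks along one dimension of an n x n matrix with block size nb."""
    return -(-n // nb)  # ceiling division


def every_rank_has_block(n, nb, nprow, npcol):
    """In a 2D block-cyclic layout, block row i lands on process row i % nprow,
    so process row p owns something only if p < number of block rows
    (and likewise for columns)."""
    nblk = blocks_per_dim(n, nb)
    return nblk >= nprow and nblk >= npcol


def largest_usable_grid(n, nb, nprow, npcol):
    """Hypothetical helper: shrink the process grid so no rank is left empty."""
    nblk = blocks_per_dim(n, nb)
    return min(nprow, nblk), min(npcol, nblk)


# Example with assumed sizes: a 100 x 100 matrix with block size 32 has 4 x 4 blocks.
print(every_rank_has_block(100, 32, 2, 2))  # 2 x 2 grid: every rank owns a block
print(every_rank_has_block(100, 32, 8, 8))  # 8 x 8 grid: some ranks are empty
print(largest_usable_grid(100, 32, 8, 8))   # grid after shrinking
```

The alternative to shrinking the grid is choosing a smaller block size so that more blocks exist, which is where the block-size redistribution comes in.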
> In particular, @alazzaro, I see you recently switched the 32-H2O input file to use ScaLAPACK instead of ELPA, but with a quick fix in ELPA it runs for me with correct accuracy, so this might not be needed.
Well, that was months ago (#1818 and #1811)... I assume ELPA has improved since then. In any case, I don't expect any benefit from using ELPA in those small tests. Personally, I consider ELPA an optimization over ScaLAPACK, and I enabled it only where useful...
From what I can tell this new redistribution feature was never actually released.
Nevertheless, we're now ensuring GPU-compatible block sizes on our side thanks to #2407.