plumed2
Possible slow down in htt (merging large vectors twice per step)
@gtribello I am looking for hot spots here and there, and I discovered a potentially expensive operation that is done twice per step with htt (it was done once per step before).
In particular, in `DomainDecomposition.cpp` the operation `mergeSortedVectors` is done:
- in `DomainDecomposition::share()` (equivalent to pre-htt), for local (in the MPI sense) atoms
- in `DomainDecomposition::reset()`, for all (non-local) atoms
Is this expected? If I just remove the call to `getAllActiveAtoms` in `reset()`, I get a speed-up of ~10% with this input:
```
c1: CENTER ATOMS=1-20000 NOPBC
c2: CENTER ATOMS=20000-40000 NOPBC
c3: CENTER ATOMS=40000-60000 NOPBC
a: ANGLE ATOMS=c1,c2,c3
RESTRAINT ARG=a AT=0 KAPPA=1
```
However, the code no longer works correctly when we use domain decomposition. What is the correct way of removing this unnecessary calculation? Why do we need to set to zero forces that are never expected to be used on the local processor?
Hello
I had a look at this, and if you are using domain decomposition it is necessary to have that call twice per step.
There is a difference between what needs to be in `unique` in `DomainDecomposition::share()` and in `DomainDecomposition::reset()`. In particular:
- In `DomainDecomposition::share()`, `unique` needs to contain only the atoms in the local domain.
- In `DomainDecomposition::reset()`, `unique` needs to contain all the atoms that PLUMED may have added a force on (which is all the atoms in all the domains).
If domain decomposition is off, then you can avoid calling `getAllActiveAtoms` in `reset()`. If that is not the case, then I think there is no way to avoid the second call.
Addressed in #1027
I am leaving this open as a reminder that we should check the impact in MPI runs and, if necessary, address it.