librascal
Why does librascal use so much memory?
For the sample code below (I can share the full thing with structures if required, there are currently 10 structures with around 40 atoms each), I get this output:
Sample Python code
def format_mem(nbytes):
    mem_mb = nbytes / 1024 / 1024
    if mem_mb > 1024:
        return f"{mem_mb / 1024:.4} GiB"
    else:
        return f"{mem_mb:.4} MiB"
print("memory before: ", format_mem(psutil.Process(os.getpid()).memory_info().rss))
soap = SphericalInvariants(
    soap_type="PowerSpectrum",
    interaction_cutoff=3.5,
    max_radial=6,
    max_angular=6,
    gaussian_sigma_constant=0.3,
    gaussian_sigma_type="Constant",
    cutoff_smooth_width=0.5,
    radial_basis="GTO",
    normalize=True,
    compute_gradients=True,
)
kernel = Kernel(
    soap, name="GAP", zeta=1, target_type="Structure", kernel_type="Sparse"
)
managers = soap.transform(frames)
print("feature mem: ", format_mem(managers.get_features(soap).nbytes))
compressor = CURFilter(soap, pseudo_points, act_on="sample per species")
X_pseudo = compressor.select_and_filter(managers)
K_MM = kernel(X_pseudo)
K_E = kernel(managers, X_pseudo, grad=(False, False))
K_E /= regularization.energies[:, np.newaxis]
K_F = kernel(managers, X_pseudo, grad=(True, False))
K_F /= regularization.forces
K_NM = np.vstack((K_E, K_F))
print("K_MM mem: ", format_mem(K_MM.nbytes))
print("K_NM mem: ", format_mem(K_NM.nbytes))
del K_E
del K_F
print("memory used: ", format_mem(psutil.Process(os.getpid()).memory_info().rss))
return K_MM, K_NM
memory before: 66.38 MiB
feature mem: 8.344 MiB
K_MM mem: 0.01221 MiB
K_NM mem: 0.4004 MiB
memory used: 3.499 GiB
The process uses around 3.5 GiB of RAM, while the features only occupy ~8 MiB. Even accounting for gradients (say 20 neighbors per atom x 3 spatial dimensions, which gives around 500 MiB of additional memory), I don't understand how the code reaches 3.5 GiB.
This issue makes it harder for me to use librascal for a large number of structures: I now have to go to compute facilities even for (what I perceive to be) small systems, since running such code locally quickly overwhelms my RAM and starts to swap aggressively, making everything very slow.
Am I missing something here? Is there a reason the code uses so much memory, or is this something we should try to improve?
How many atomic species are present in this dataset?
So switching the code above from computing representation for all frames:
managers = soap.transform(frames)
K_E = kernel(managers, X_pseudo, grad=(False, False))
K_E /= regularization.energies[:, np.newaxis]
K_F = kernel(managers, X_pseudo, grad=(True, False))
K_F /= regularization.forces
K_NM = np.vstack([K_E, K_F])
to computing one frame at a time:
K_E = []
K_F = []
for i, frame in enumerate(frames):
    managers = soap.transform([frame])
    k = kernel(managers, X_pseudo, grad=(False, False))
    K_E.append(k / regularization.energies[i, np.newaxis])
    k = kernel(managers, X_pseudo, grad=(True, False))
    K_F.append(k / regularization.forces)
K_NM = np.vstack((*K_E, *K_F))
Brings the memory usage down to 500MiB, and is faster to execute overall.
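The per-frame loop above could also be run over small batches of frames, which might recover some throughput while keeping peak memory bounded. A minimal sketch, assuming a hypothetical helper `iter_batches` (not part of librascal) and an arbitrary batch size of 4:

```python
from itertools import islice


def iter_batches(items, batch_size):
    """Yield successive lists of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for the list of ASE frames used in the thread.
frames = list(range(10))
batches = list(iter_batches(frames, 4))
print([len(batch) for batch in batches])
```

Each batch would then be passed to `soap.transform(batch)` in place of `[frame]`, with the regularization arrays sliced to match. Whether this is faster than one frame at a time has not been measured here.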
Here is a simpler standalone example (without kernels), reading from the file in https://github.com/cosmo-epfl/librascal/issues/324#issuecomment-802711381
import os
import psutil
from rascal.representations import SphericalInvariants
import ase
from ase import io
def format_mem(nbytes):
    mem_mb = nbytes / 1024 / 1024
    if mem_mb > 1024:
        return f"{mem_mb / 1024:.4} GiB"
    else:
        return f"{mem_mb:.4} MiB"
frames = ase.io.read("structures.xyz", ":")
print("memory before: ", format_mem(psutil.Process(os.getpid()).memory_info().rss))
soap = SphericalInvariants(
    soap_type="PowerSpectrum",
    interaction_cutoff=3.5,
    max_radial=6,
    max_angular=6,
    gaussian_sigma_constant=0.3,
    gaussian_sigma_type="Constant",
    cutoff_smooth_width=0.5,
    radial_basis="GTO",
    normalize=True,
    compute_gradients=True,
)
managers = soap.transform(frames)
print("memory used:", format_mem(psutil.Process(os.getpid()).memory_info().rss))
print(" including features:", format_mem(managers.get_features(soap).nbytes))
output:
memory before: 51.01 MiB
memory used: 3.452 GiB
including features: 8.344 MiB
That's with 10 frames, containing 434 atoms in total.
Wow, that's impressive. Leak somewhere?
(Sent by email in reply to Guillaume Fraux's message of Fri, 16 Apr 2021 at 17:13, quoted above.)
That, or we keep stuff around that is no longer needed (I'm thinking about the cell list, I've seen issues in other software where building the neighbor list blows up memory). I'll try running this example with massif to see if I can get a profile & more information about the origin of allocations.
Here is a massif profile: massif.out.4090.txt
And the (edited) output of ms_print --threshold=10 massif.out.4090
99.41% (3,658,807,544B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->96.84% (3,564,014,336B) 0x3415D972: Eigen::DenseStorage<double, -1, -1, 1, 0>::resize(long, long, long) [clone .isra.508] (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| ->86.18% (3,171,934,400B) 0x341B4D52: void rascal::CalculatorSphericalInvariants::initialize_per_center_powerspectrum_soap_vectors<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, rascal::BlockSparseProperty<double, 1ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >, rascal::BlockSparseProperty<double, 2ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >, rascal::BlockSparseProperty<double, 1ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > > >(rascal::BlockSparseProperty<double, 1ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >&, rascal::BlockSparseProperty<double, 2ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >&, rascal::BlockSparseProperty<double, 1ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >&, std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->86.18% (3,171,934,400B) 0x341B580F: void rascal::CalculatorSphericalInvariants::compute_impl<(rascal::internal::SphericalInvariantsType)1, 0, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->86.18% (3,171,934,400B) 0x341B7690: void rascal::CalculatorSphericalInvariants::compute<rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict> >(rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->86.18% (3,171,934,400B) 0x3416CEBD: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<void, rascal::CalculatorSphericalInvariants, rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(void (rascal::CalculatorSphericalInvariants::*)(rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(rascal::CalculatorSphericalInvariants*, rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&)
| | ->86.18% (3,171,934,400B) 0x340B22C6: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| |
| ->10.30% (379,246,208B) 0x341953ED: void rascal::BlockSparseProperty<double, 2ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >::resize<std::vector, std::allocator<std::set<std::vector<int, std::allocator<int> >, std::less<std::vector<int, std::allocator<int> > >, std::allocator<std::vector<int, std::allocator<int> > > > >, std::vector<int, std::allocator<int> >, std::less<std::vector<int, std::allocator<int> > >, std::allocator<std::vector<int, std::allocator<int> > > >(std::vector<std::set<std::vector<int, std::allocator<int> >, std::less<std::vector<int, std::allocator<int> > >, std::allocator<std::vector<int, std::allocator<int> > > >, std::allocator<std::set<std::vector<int, std::allocator<int> >, std::less<std::vector<int, std::allocator<int> > >, std::allocator<std::vector<int, std::allocator<int> > > > > > const&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341962A3: void rascal::CalculatorSphericalExpansion::initialize_expansion_environment_wise<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >&, rascal::BlockSparseProperty<double, 1ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >&, rascal::BlockSparseProperty<double, 2ul, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > >, std::vector<int, std::allocator<int> > >&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341A9886: void rascal::CalculatorSphericalExpansion::compute_impl<(rascal::internal::CutoffFunctionType)0, (rascal::internal::RadialBasisType)0, (rascal::internal::AtomicSmearingType)0, (rascal::internal::OptimizationType)0, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341AC484: void rascal::CalculatorSphericalExpansion::compute_by_radial_contribution<(rascal::internal::CutoffFunctionType)0, std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341AFAD3: void rascal::CalculatorSphericalExpansion::compute<std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341B554D: void rascal::CalculatorSphericalInvariants::compute_impl<(rascal::internal::SphericalInvariantsType)1, 0, rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >(std::shared_ptr<rascal::AdaptorStrict<rascal::AdaptorCenterContribution<rascal::AdaptorNeighbourList<rascal::StructureManagerCenters> > > >) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x341B7690: void rascal::CalculatorSphericalInvariants::compute<rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict> >(rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&) (in /local/scratch/fraux/local/lib/python3.6/site-packages/rascal/lib/_rascal.cpython-36m-x86_64-linux-gnu.so)
| | ->10.30% (379,246,208B) 0x3416CEBD: void pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<void, rascal::CalculatorSphericalInvariants, rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::call_guard<pybind11::gil_scoped_release> >(void (rascal::CalculatorSphericalInvariants::*)(rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::call_guard<pybind11::gil_scoped_release> const&)::{lambda(rascal::CalculatorSphericalInvariants*, rascal::ManagerCollection<rascal::StructureManagerCenters, rascal::AdaptorNeighbourList, rascal::AdaptorCenterContribution, rascal::AdaptorStrict>&)
Most memory is allocated by rascal::CalculatorSphericalInvariants::initialize_per_center_powerspectrum_soap_vectors (3171934400 B, or 2.95 GiB), with the next contributor being CalculatorSphericalExpansion::initialize_expansion_environment_wise (379246208 B, or 362 MiB).
Interesting, that's what actually allocates the memory for the SOAP power spectrum (and gradients IIRC). Are you sure you're correctly accounting for the memory required by the features?
If you're sure, then this could point to a bug in the allocation routines (allocating too much memory...?)
Are you sure you're correctly accounting for the memory required by the features?
Features are 8.3MiB for 434 atoms. Considering 20 neighbors per atom, the gradients need 3 x 20 x 8.3MiB for storage, which is 498 MiB.
Running code like this
n_atoms = 434
neighbors = [set() for _ in range(n_atoms)]
for atom, neighbor in managers.get_gradients_info()[:, 1:3]:
    neighbors[atom].add(neighbor)
print(sum(len(n) for n in neighbors) / n_atoms, "neighbors in average")
Gives me 43.77880184331797 neighbors in average, which should amount to roughly 1 GiB of memory.
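For reference, the arithmetic behind both estimates can be written out as below. This is only a sketch of the back-of-the-envelope reasoning in this thread: it assumes the gradients are stored densely, with one feature-sized block per neighbor and per Cartesian direction, and `gradient_estimate_mib` is a name made up for illustration.

```python
def gradient_estimate_mib(feature_mib, neighbors_per_atom, n_dims=3):
    """Dense estimate: one feature-sized block per neighbor and per direction."""
    return n_dims * neighbors_per_atom * feature_mib


# Size reported by managers.get_features(soap).nbytes in the thread.
feature_mib = 8.344

initial_guess = gradient_estimate_mib(feature_mib, 20)
measured = gradient_estimate_mib(feature_mib, 43.77880184331797)

print(f"20 neighbors:    {initial_guess:.0f} MiB")
print(f"43.78 neighbors: {measured:.0f} MiB ({measured / 1024:.2f} GiB)")
```

Either way the estimate stays well below the 3.5 GiB observed.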
I appreciate a second look at this, I might be overlooking something!
Hmm, do you have a way of getting info on the size of the gradients entries themselves? There might be some complications with species cross terms that could have you ending up with more gradients entries than just the 3 * n_neigh * n_atoms * n_max**2 * (l_max + 1) * (n_species * (n_species + 1)) / 2 terms that your calculation above suggests.
The easy way to check this would be to try a single-species system, where you have no species cross terms, and see if your estimate is more accurate.
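The term count quoted above can be evaluated directly for a few species counts. The sketch below assumes 8-byte floats and takes `l_max = 6` from `max_angular=6`. The number of species in the dataset is not stated in this thread, so the values looped over are placeholders, and both function names are hypothetical.

```python
def n_gradient_entries(n_atoms, n_neigh, n_max, l_max, n_species):
    """3 * n_neigh * n_atoms * n_max**2 * (l_max + 1) * n_species * (n_species + 1) / 2"""
    n_species_pairs = n_species * (n_species + 1) // 2
    features_per_center = n_max**2 * (l_max + 1) * n_species_pairs
    return 3 * n_neigh * n_atoms * features_per_center


def to_mib(n_entries, bytes_per_entry=8):
    """Convert a number of float64 entries to MiB."""
    return n_entries * bytes_per_entry / 1024**2


# Placeholder species counts: the real value is not given in the thread.
for n_species in (1, 2, 4):
    entries = n_gradient_entries(
        n_atoms=434, n_neigh=20, n_max=6, l_max=6, n_species=n_species
    )
    print(f"{n_species} species: {entries} entries, {to_mib(entries):.1f} MiB")
```

The single-species row is the case suggested as a sanity check: with no cross terms, the measured gradient size should be close to this number if the estimate is right.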
Continuing debugging, it looks like there are duplicated entries in the gradients:
import ase
from rascal.representations import SphericalInvariants
frames = [
    ase.Atoms(
        "CC",
        cell=[4.0, 4.0, 4.0],
        pbc=[True, True, True],
        positions=[
            [0.68081000, 3.08633000, 0.58394200],
            [0.07090640, 2.64372000, 0.14372900],
        ],
    )
]
for frame in frames:
    frame.wrap(eps=1e-10)
soap = SphericalInvariants(
    soap_type="PowerSpectrum",
    interaction_cutoff=3.5,
    max_radial=6,
    max_angular=6,
    gaussian_sigma_constant=0.3,
    gaussian_sigma_type="Constant",
    cutoff_smooth_width=0.5,
    radial_basis="GTO",
    normalize=False,
    compute_gradients=True,
)
managers = soap.transform(frames)
print(managers.get_gradients_info())
Outputs
# columns are: structure, atom, neighbor, species_atom, species_neighbor
[[0 0 0 6 6]
[0 0 1 6 6]
[0 0 1 6 6]
[0 1 1 6 6]
[0 1 0 6 6]
[0 1 0 6 6]]
Notice how atom 0 appears twice as a neighbor of atom 1, and atom 1 appears twice as a neighbor of atom 0. Since we use a reduction (i.e. a sum over neighbors) most of the time, I can see how this could work fine when computing kernels while using more memory than needed.
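One way to count such repeated rows is sketched below, using the six rows printed above as plain lists so that the snippet runs without librascal. With a real calculation, `managers.get_gradients_info().tolist()` should give the same structure, assuming it returns a NumPy array as the slicing earlier in the thread suggests.

```python
from collections import Counter

# Rows copied from the output above:
# structure, atom, neighbor, species_atom, species_neighbor
gradients_info = [
    [0, 0, 0, 6, 6],
    [0, 0, 1, 6, 6],
    [0, 0, 1, 6, 6],
    [0, 1, 1, 6, 6],
    [0, 1, 0, 6, 6],
    [0, 1, 0, 6, 6],
]

# Count how often each (structure, atom, neighbor) triple occurs.
pair_counts = Counter((s, a, n) for s, a, n, _, _ in gradients_info)
duplicates = {pair: count for pair, count in pair_counts.items() if count > 1}
n_redundant = sum(count - 1 for count in duplicates.values())

print("duplicated (structure, atom, neighbor):", duplicates)
print(f"{n_redundant} of {len(gradients_info)} rows are repeats")
```

With a 4 Å cell and a 3.5 Å cutoff, the repeated rows may simply be distinct periodic images of the same atom. That is a guess; nothing in this thread confirms it.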
It looks like a lot of the memory usage can be attributed to SOAP vector normalization. Running the script from https://github.com/cosmo-epfl/librascal/issues/324#issuecomment-821246856 with normalize = True gives
memory before: 56.79 MiB
memory used: 3.46 GiB
but running it with normalize = False gives
memory before: 56.75 MiB
memory used: 1.669 GiB
So normalizing the SOAP vectors costs around 1.8 GiB of additional memory. That's for a feature matrix with 437 rows/atoms and a gradient matrix with 157338 rows/neighbours.
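As a rough cross-check, the reported row count can be compared with the massif profile posted earlier. The sketch below assumes float64 storage and that every gradient row holds a full-length feature vector, with no species sparsity. It takes the feature width from the 8.344 MiB feature matrix and the 434 atoms reported earlier in the thread (not the 437 quoted here).

```python
MIB = 1024**2
GIB = 1024**3

n_atoms = 434                   # atom count reported with the standalone example
feature_bytes = 8.344 * MIB     # reported size of the feature matrix
n_gradient_rows = 157338        # reported number of gradient rows
massif_bytes = 3_171_934_400    # initialize_per_center_powerspectrum_soap_vectors

# Features per atom, assuming 8-byte floats.
n_features = round(feature_bytes / n_atoms / 8)

# Dense gradient storage: every row holds a full feature vector.
dense_gradient_bytes = n_gradient_rows * n_features * 8

print("features per atom:", n_features)
print(f"dense gradient estimate: {dense_gradient_bytes / GIB:.3f} GiB")
print(f"reported by massif:      {massif_bytes / GIB:.3f} GiB")
```

Under these assumptions the two numbers agree to within a few hundred bytes, which would be consistent with the normalized gradients being allocated densely over all feature columns.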
Ok, this is starting to make me suspicious of this function: https://github.com/cosmo-epfl/librascal/blob/41896982bd0a64945f0609ada6fa6e12ef79baf0/src/rascal/representations/calculator_spherical_invariants.hh#L588-L593
Does the extra memory usage (when including normalization) only happen when computing gradients? And it's only for SOAP (SphericalInvariants, not SphericalExpansion) that you're seeing this extra memory usage, right? If so, then this update_gradients_for_normalization thing would be the next place to look.
Does the extra memory usage (when including normalization) only happen when computing gradients?
Yes. Without gradients, normalizing only increases the memory used by a couple dozen kilobytes, as I would expect.
And it's only for SOAP (SphericalInvariants, not SphericalExpansion) that you're seeing this extra memory usage, right?
For the SphericalInvariants, I only tested the power spectrum.
The spherical expansion also has some strange behavior. On the same dataset and hyperparameters, librascal uses 30 MB when only computing the values of the spherical expansion, and 400 MB when computing both values and gradients. Rascaline uses the same 30 MB when computing the values, but only 200 MB for the gradients (of these, 4 MB store the values and 136 MB the gradients, with sparse species storage).
So librascal uses twice as much memory when doing the gradients. There might be something fishy here (either rascaline doing something wrong or librascal overallocating), but it is much less of an issue overall.
Hello,
I was able to pinpoint it to this line: https://github.com/cosmo-epfl/librascal/blob/db2e2445d34c196c94731249061740123f9fbc28/src/rascal/representations/calculator_spherical_invariants.hh#L1373
Later, when the gradients are resized using the key_list_grad, the memory starts to differ between normalize True and False: https://github.com/cosmo-epfl/librascal/blob/db2e2445d34c196c94731249061740123f9fbc28/src/rascal/representations/calculator_spherical_invariants.hh#L1393
When normalizing, it uses the less sparse key pair_list instead of pair_list_grad (less sparse for multiple species). My guess is that this makes the operations in update_gradients_for_normalization (the function Max posted) with soap_vector_N more convenient, but I don't understand this part of the code well.
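To illustrate what "less sparse" could mean in terms of storage, here is a small sketch. It assumes that the sparser key list only holds the species pairs involving the neighbor's own species, while the denser one holds every pair. That is my reading and is not confirmed by the code links above. The four-species list and both function names are made up for the example.

```python
from itertools import combinations_with_replacement


def all_pairs(species):
    """Every unordered species pair (assumed content of the less sparse list)."""
    return list(combinations_with_replacement(sorted(species), 2))


def pairs_with(species, neighbor_species):
    """Only the pairs that involve the neighbor's species (assumed sparse list)."""
    return [pair for pair in all_pairs(species) if neighbor_species in pair]


# Hypothetical four-species system; the dataset's species are not given here.
species = [1, 6, 7, 8]
dense = all_pairs(species)
sparse = pairs_with(species, 6)

print(len(dense), "pairs when dense,", len(sparse), "when sparse")
print(f"sparse/dense storage ratio: {len(sparse) / len(dense):.1f}")
```

Under this assumption the sparse layout needs well under half the gradient storage for four species, and the gap grows with the number of species.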
The remaining memory difference between normalize=True and False (which is minimal compared to the effect of the above) is at least partially due to the storage of the normalization coefficients: https://github.com/cosmo-epfl/librascal/blob/db2e2445d34c196c94731249061740123f9fbc28/src/rascal/representations/calculator_spherical_invariants.hh#L743-L744 I don't think we free that memory, but this is just a guess and, again, it is not really significant.