Use native scipy periodic KDTree for orthorhombic boxes?
The MDAnalysos PBC-awared distance calculations normally use a grid-based method with "augmentation" to find minimum image distances.
However, as an alternative method a KDTree based method also exists in lib.pkdtree. More recent versions of scipy have added the boxsize=[Lx, Ly, Lz] kwarg to scipy.spatial.cKDTree to natively compute nearest neighbors on a toroidal topology (i.e., orthorhombic boxes in the case of MD simulations). As recently pointed out in https://github.com/scipy/scipy/issues/16191#issuecomment-2568357454 we could use this update to scipy.
- We should investigate if the performance of the native scipy periodic KDTree is better than our implementation.
- If yes, we should consider changing https://github.com/MDAnalysis/mdanalysis/blob/5d48c5cf5f89700dde7ac5fb6a2e74458ce27a7e/package/MDAnalysis/lib/pkdtree.py#L66 so that we only use our augmented KDTree for cases that native scipy cannot handle.
This is a great suggestion for standardizing our PBC implementation! Using the native $\text{SciPy}$ feature would be excellent for code maintainability, assuming the performance is competitive.I can take on the initial performance investigation as proposed. I will set up a benchmarking script to compare our existing $\text{lib.pkdtree}$ against the native $\text{scipy.spatial.cKDTree}$ with the boxsize argument, focusing on typical orthorhombic boxes. I'll report back with the results (e.g., runtime vs. number of atoms) next week.
Hi @ChinmayRout9040895625 , are you still working on this issue? I was interested in contributing here and wanted to confirm before starting.
Hi @orbeckst ,
These results are for cutoff of 15 Angstrom , Scipy turned out to be significantly faster, Benchmark Results (MDAnalysis vs SciPy):
N=100:Tree Build (714μs vs 81μs), Query (160μs vs 14μs) N=1,000:Tree Build (6.86ms vs 0.37ms), Query (333μs vs 37μs) N=10,000: Tree Build (47.8ms vs 3.24ms), Query (2.57ms vs 242μs) N=100,000: Tree Build (1.02s vs 54.3ms), Query (21.9ms vs 2.84ms)
@Aryaman-Chaudhri these numbers are interesting. How do the scipy values depend on cutoff for a range of N when compared to capped_distance_array?
Basically my question is under which conditions we would select to use periodic KDTree.
Can you paste the code you used for benchmarking?
import MDAnalysis as mda
import numpy as np
from MDAnalysis.lib.pkdtree import PeriodicKDTree
from scipy.spatial import cKDTree
class benchmark_test_scipy():
#taken parameters of various sizes for analysis
params=([100,1000,10000,100000])
param_names = (['number_of_atoms'])
def setup(self, number_of_atoms):
self.box = np.array([170.0, 70.0, 120.0, 90.0, 90.0, 90.0] , dtype=np.float32)
self.positions = (np.random.rand(number_of_atoms, 3) * self.box[:3]).astype(np.float32)
self.centre = self.box[:3] / 2.0
self.radius = 15
self.mda_tree = PeriodicKDTree(box=self.box)
self.mda_tree.set_coords(self.positions , cutoff=self.radius)
self.scipy_tree = cKDTree(self.positions, boxsize=self.box[:3])
def time_mda_tree(self,number_of_atoms):
tree = PeriodicKDTree(box=self.box)
tree.set_coords(self.positions, cutoff=self.radius)
def time__scipy_tree(self,number_of_atoms):
cKDTree(self.positions, boxsize=self.box[:3])
def time_mda(self,number_of_atoms):
self.mda_tree.search(self.centre , self.radius)
def time_scipy(self,number_of_atoms):
self.scipy_tree.query_ball_point(self.centre , self.radius)
Hello @orbeckst , the above is the current code used for benchmarking , i will include the capped_distance_array in my code and test all 3 against different values and then come back with the answers.
Hello @orbeckst , It appears that Scipy is faster for smaller cutoff but as we increase cutoff size capped distance becomes faster , as we increase the value of N the cutoff for which capped distance becomes faster than scipy shifts leftward. for example lets say by increasing size of N cutoff where capped distance beats scipy shifts from lets say 40 to 30. We can implement a solution for which we check orthorhombic if yes then implement scipy or capped distance else use pkdtree or capped distance but i think that will require another analysis as to how can alpha , beta , gamma angles can effect the results . Give any feedback if necessary , I am also attaching the code used for this benchmarking.
#taken parameters of various sizes for analysis
params=([100,1000,10000,100000],[20,30,36,42,48,50,60])
param_names = (['number_of_atoms', 'cutoff'])
def setup(self, number_of_atoms , cutoff):
self.box = np.array([170.0, 70.0, 120.0, 90.0, 90.0, 90.0] , dtype=np.float32)
self.positions = (np.random.rand(number_of_atoms, 3) * self.box[:3]).astype(np.float32)
self.centre = (self.box[:3] / 2.0).reshape(1, 3)
self.cutoff = cutoff
self.mda_tree = PeriodicKDTree(box=self.box)
self.mda_tree.set_coords(self.positions , cutoff=self.cutoff)
self.scipy_tree = cKDTree(self.positions, boxsize=self.box[:3])
def time_mda_tree(self,number_of_atoms , cutoff):
tree = PeriodicKDTree(box=self.box)
tree.set_coords(self.positions, cutoff=self.cutoff)
def time_scipy_tree(self,number_of_atoms , cutoff ):
cKDTree(self.positions, boxsize=self.box[:3])
def time_mda(self,number_of_atoms ,cutoff):
self.mda_tree.search(self.centre , self.cutoff)
def time_scipy(self,number_of_atoms,cutoff):
self.scipy_tree.query_ball_point(self.centre , self.cutoff)
def time_capped_distance_array(self, number_of_atoms, cutoff):
capped_distance(self.centre, self.positions , max_cutoff = self.cutoff , box=self.box)
Hello @orbeckst , I ran benchmark of pdktree vs capped distance for triclinic cases too , capped distance proved to be much faster , .Here is the data . Cutoff = 20 Å: 677 ms vs 11.4 ms
Cutoff = 30 Å: 927 ms vs 13.1 ms
Cutoff = 50 Å: 902 ms vs 16.9 ms
Cutoff = 60 Å: 1.07 s vs 20.2 ms for N=100000 so capped distance is much faster than pdktree for triclinic cases as well.
Isn't implementing something like ? if orthorhombic use scipy for < 50 Angstrom use capped distance > 50 Angstrom else use capped distance.
Using 50 angstrom as thats the point around which capped distance becomes faster than scipy for N=100000.
Hi @orbeckst, I know you are busy, but could you please take a look at the benchmark results I posted above? I'd like to move forward with the PR if the performance numbers look good to you.
Summary
Thank you for all the preliminary benchmarking. Based on your work and my own quick mini-bench (see below) I'm coming to the following conclusions:
- For orthorhombic boxes, scipy's cKDTree can be 5 to 50 times faster than MDA's PeriodicKDTree. https://github.com/MDAnalysis/mdanalysis/issues/5114#issuecomment-3529297215
- For all boxes, MDA's capped_distance is faster than cKDTree (note that cKDTree cannot correctly handle general triclinic boxes). https://github.com/MDAnalysis/mdanalysis/issues/5114#issuecomment-3549333842
Thus, scipy cKDTree could replace PKDTree for orthorhombic boxes.
Does this sound right? Please correct me if I misunderstood anything.
Next steps
- Check that scipy.spatial.cKDTree and MDAnalysis.lib.pkdtree.PKDTree produce the same results for the same input. If true, continue.
- Use scipy.spatial.cKDTree instead of PKDTree for orthorhombic boxes.
- Check that the tests are still passing.
Looking at the code in lib/pkdtree.py I would suggest the following
- Rename PKDTree into AugmentedPKDTree.
- Add a new PKDTree class that has the same interface as AugmentedPKDTree but delegates to scipy.spatial.cKDTree or AugmentedPKDTree, depending on the PBC:
- no PBC or orthorhombic: scipy.spatial.cKDTree
- triclinic: AugmentedPKDTree
- update the tests in test_pkdtree so that they are ran with both PKDTree (checking that we did not break the user-facing side) and AugmentedPKDTree (checking that we did not somehow break our existing working implementation)
Mini-bench
Based on your code, I did some quick testing on my M2 mac with MDAnalysis 2.10.0 and scipy 1.16.3:
Results
| Function | N = 10,000 (10k) | N = 100,000 (100k) |
|---|---|---|
| capped_distance_array | 75.5 µs ± 4.58 µs | 647 µs ± 7.71 µs |
| MDA PKDTree | 17.1 ms ± 292 µs | 198 ms ± 1.26 ms |
| SciPy cKDTree | 1.94 ms ± 34.6 µs | 23.7 ms ± 239 µs |
In [1]: import neighbors.py
# bigger system 100k particles
In [2]: nb = distances.NeighborsBench(100_000, 40)
In [3]: %timeit nb.time_capped_distance_array()
647 μs ± 7.71 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
In [4]: %timeit nb.time_mda_PKDtree()
198 ms ± 1.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [5]: %timeit nb.time_scipy_cKDTree()
23.7 ms ± 239 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# small system 10k particles
In [6]: nb_small = distances.NeighborsBench(10_000, 40)
In [7]: %timeit nb_small.time_capped_distance_array()
75.5 μs ± 4.58 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [8]: %timeit nb_small.time_mda_PKDtree()
17.1 ms ± 292 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [9]: %timeit nb_small.time_scipy_cKDTree()
1.94 ms ± 34.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
These mini-benchmarks confirm your findings: the time to solution (setup + query) is
capped_distance < scipy cKDTree << MDA PKDTree
with the limitation that scipy cKDTree only works for orthorhombic boxes.
Code neighbors.py
# https://github.com/MDAnalysis/mdanalysis/issues/5114
# author: @Aryaman-Chaudhri (modifications @orbeckst)
import numpy as np
from MDAnalysis.lib.pkdtree import PeriodicKDTree
from MDAnalysis.lib.distances import capped_distance
from scipy.spatial import cKDTree
#taken parameters of various sizes for analysis
params=([100,1000,10000,100000],[20,30,36,42,48,50,60])
param_names = (['number_of_atoms', 'cutoff'])
### Not functional for ASV
#### just manual experimentation and timing in ipython
class NeighborsBench(object):
"""Benchmarks for neighbor searching functions."""
params = ([100,1000,10000,100000],[20,30,36,42,48,50,60])
param_names = (['number_of_atoms', 'cutoff'])
def __init__(self, number_of_atoms , cutoff):
self.setup(number_of_atoms , cutoff)
def setup(self, number_of_atoms , cutoff):
self.box = np.array([170.0, 70.0, 120.0, 90.0, 90.0, 90.0] , dtype=np.float32)
self.positions = (np.random.rand(number_of_atoms, 3) * self.box[:3]).astype(np.float32)
self.centre = (self.box[:3] / 2.0).reshape(1, 3)
self.cutoff = cutoff
def setup_scipy_cKDTree(self):
self.scipy_tree = cKDTree(self.positions, boxsize=self.box[:3])
def setup_mda_tree(self):
self.mda_tree = PeriodicKDTree(box=self.box)
self.mda_tree.set_coords(self.positions , cutoff=self.cutoff)
def time_mda_tree(self):
self.mda_tree.search(self.centre , self.cutoff)
def time_scipy_tree(self):
self.scipy_tree.query_ball_point(self.centre , self.cutoff)
def time_mda_PKDtree(self):
self.setup_mda_tree()
self.time_mda_tree()
def time_scipy_cKDTree(self):
self.setup_scipy_cKDTree()
self.time_scipy_tree()
def time_capped_distance_array(self):
capped_distance(self.centre, self.positions , max_cutoff = self.cutoff , box=self.box)