mdanalysis
mdanalysis copied to clipboard
Mismatch in groupselections for RMSD between target and reference
Expected behavior
If I pass an atomgroup to RMSD, groupselections are taken only from that AtomGroup
Actual behavior
This does not happen for one of the target or reference for groupselections, resulting in an error when the RMSD is calculated due to mismatching atoms.
Code to reproduce the behavior
Error when selections are not subsets of atomgroups
In [1]: import MDAnalysis as mda
...: from MDAnalysis.tests.datafiles import PSF, DCD, CRD
...: from MDAnalysis.analysis import rms
In [2]: u = mda.Universe(PSF, DCD)
In [3]: ca = u.select_atoms('name CA')
In [4]: CORE = 'backbone and (resid 1-29 or resid 60-121 or resid 160-214)'
In [5]: R = rms.RMSD(ca, ca, select=CORE,
...: groupselections=[CORE],
...: ref_frame=0)
...: R.run()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-76040dc15767> in <module>
2 groupselections=[CORE],
3 ref_frame=0)
----> 4 R.run()
~/anaconda3/envs/mdanalysis/lib/python3.7/site-packages/MDAnalysis/analysis/base.py in run(self, start, stop, step, verbose)
195 self.times[i] = ts.time
196 # logger.info("--> Doing frame {} of {}".format(i+1, self.n_frames))
--> 197 self._single_frame()
198 logger.info("Finishing up")
199 self._conclude()
~/anaconda3/envs/mdanalysis/lib/python3.7/site-packages/MDAnalysis/analysis/rms.py in _single_frame(self)
663 refpos, atoms['mobile'].positions,
664 weights=self.weights_groupselections[igroup-3],
--> 665 center=False, superposition=False)
666 else:
667 # only calculate RMSD by setting the Rmatrix to None (no need
~/anaconda3/envs/mdanalysis/lib/python3.7/site-packages/MDAnalysis/analysis/rms.py in rmsd(a, b, weights, center, superposition)
231 N = b.shape[0]
232 if a.shape != b.shape:
--> 233 raise ValueError('a and b must have same shape')
234
235 # superposition only works if structures are centered
ValueError: a and b must have same shape
Working when selections are entirely within the atomgroups
In [7]: bb = u.select_atoms('backbone')
In [8]: R = rms.RMSD(bb, bb, select=CORE,
...: groupselections=[CORE],
...: ref_frame=0)
...: R.run()
Out[8]: <MDAnalysis.analysis.rms.RMSD at 0x1183f5f90>
Current version of MDAnalysis
- Which version are you using? (run
python -c "import MDAnalysis as mda; print(mda.__version__)") 1.0.0 - Which version of Python (
python -V)? 3.7.3 - Which operating system?
Ok, groupselections are selected from target Universe here:
https://github.com/MDAnalysis/mdanalysis/blob/31929685389638232b8f22a2ee9cc63bd659f8d5/package/MDAnalysis/analysis/rms.py#L517-L522
But the reference groupselection is selected from the given reference AtomGroup here:
https://github.com/MDAnalysis/mdanalysis/blob/31929685389638232b8f22a2ee9cc63bd659f8d5/package/MDAnalysis/analysis/rms.py#L592-L596
It seems to me that the below check should be moved to after the reference groupselection is found, and also the original target atoms should not be selected from universes.
https://github.com/MDAnalysis/mdanalysis/blob/31929685389638232b8f22a2ee9cc63bd659f8d5/package/MDAnalysis/analysis/rms.py#L524-L533
Hi I am new to open source Can you help me how to get started ?
Hello @RAJULDEV , welcome to the MDAnalysis community. Please see https://userguide.mdanalysis.org/stable/contributing.html for how to contribute to MDAnalysis.
We recommend you work through our introductory tutorial https://userguide.mdanalysis.org/stable/examples/quickstart.html to get a feel for MDAnalysis if you haven't used it before. This issue, for example, makes a lot more sense when you understand what an RMSD calculation is and why the selections in question are important.
We have an active mailing list and discord server where you can start discussions, see https://www.mdanalysis.org/#participating for how to participate.
Hi @lilyminium , Thanks for this issue.
I think the error in this case is simply because we are trying to select something that does not exist, i.e. backbone atoms (N,C or CA) from a universe that has only CA atoms. To see, if this is indeed the case, I changed the CORE selection to 'name CA' from 'backbone', and the code works perfectly now.
System details:
- Which version of MDAnalysis I am using? 2.7.0 Which version of Python (python -V)? Python 3.10.8 Which operating system? Ubuntu 20.0.4.5 LTS on Windows Subsytem by Linux, on Jupyter Notebook ver 7.0.6
code to reproduce solution
CORE= 'name CA and (resid 1-29 or resid 60-121 or resid 160-214)' R = rms.RMSD(ca, ca, select=CORE, groupselections=[CORE], ref_frame=0) R.run() ##output <MDAnalysis.analysis.rms.RMSD at 0x7f41b1bd5690> `CORE = 'backbone and (resid 1-29 or resid 60-121 or resid 160-214)'
Code snippet from Jupyter Notebook
Okay, so : by using the selection on CA atoms which are technically a subset of backbone atoms, the selection is failing
If I redefined CORE with only CA atoms:
In[15]:
CORE = 'name CA and (resid 1-29 or resid 60-121 or resid 160-214)'
In[16]:
R = rms.RMSD(ca, ca, select=CORE, groupselections=[CORE], ref_frame=0)`
I hadn't looked at this issue in forever... The docs for groupselection say explicitly that these selections are for the whole system:
A list of selections as described for select, with the difference that these selections are always applied to the full universes, i.e.,
atomgroup.universe.select_atoms(sel1)andreference.universe.select_atoms(sel2). Each selection describes additional RMSDs to be computed after the structures have been superimposed according to select. No additional fitting is performed.The output contains one additional column for each selection.
The intended behavior was "Do a superposition on whatever is good substructure and then do the RMSD calculation on whatever else you're interested in." We use this a lot to, for instance, superimpose on a stable membrane domain and then calculate the RMSD of a moving domain. In this case the two groups have zero intersection.
If I understood the Expected Behavior
If I pass an atomgroup to RMSD, groupselections are taken only from that AtomGroup
correctly then you would effectively suggest to change the code to atomgroup.select_atoms(sel1) and reference.select_atoms(sel2) or am I misunderstanding, @lilyminium ?
If my interpretation is correct then I disagree with the Expected behavior. However, even if we keep the old behavior, we shouldn't be throwing a cryptic error.
Thanks for the clarification! I found this issue from the GSoC starter tag, and wanted to solve this as my first issue. Of course, my solution is just for this specific use case and not for the main code base. Happy to have a re-think on this!
(I am working up the courage for a formal introduction of myself in Discussions)
@SampurnaM welcome to MDAnalysis :-)
Can I proceed with this issue for my GSoc Starter PR please? If it is already solved, I will look at something else :)