rmsd icon indicating copy to clipboard operation
rmsd copied to clipboard

Why these two pdbs can not calculate RMSD?

Open Qmi3 opened this issue 2 years ago • 4 comments
trafficstars

The first file contain hydrogen but when i use --no-hydrogen and --reorder ,it's still reported that structure not same size .They are both 169 residues.

Qmi3 avatar Mar 14 '23 06:03 Qmi3

test.txt test1.txt

Qmi3 avatar Mar 14 '23 06:03 Qmi3

In short: the two data (still) differ in their account of non-hydrogen atoms.

For a check, I use openbabel (version 3.1.1) as provided by the repositories of Linux Debian bookworm (still in branch testing).

$ obabel -ipdb test.txt -O test.xyz
1 molecule converted
obabel -ipdb test1.txt -O test1.xyz
==============================
*** Open Babel Warning  in PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is test1.txt)

1 molecule converted
$ wc -l *.xyz
  1323 test1.xyz
  2677 test.xyz
  4000 total

At this point, not too surprising because - coherent to your description - one includes hydrogen atoms, the other does not:

$ head test1.xyz
1321
test1.txt
N          1.20800       33.64200       -8.60100
C          0.66000       33.38100       -9.95200
C          0.95100       31.93500      -10.34000
O          1.53100       31.20400       -9.53900
N          0.58900       31.49000      -11.56900
C          0.66200       30.06700      -11.94000
C          2.10500       29.51400      -11.94100
O          3.03700       30.27100      -12.32300
$ head test.xyz
2675
test.txt
N        -10.55800      -12.08700       10.70100
H        -10.45000      -13.03200       11.04000
H        -11.27500      -11.61400       11.23200
H         -9.67100      -11.61900       10.82100
C        -10.90000      -12.11700        9.26500
H        -11.81800      -12.68200        9.10900
H        -11.02200      -11.10200        8.88500
C         -9.77800      -12.78600        8.49900

Hence, an additional step which removes the hydrogen atoms:

$ obabel test.xyz -d -O test_a.xyz
1 molecule converted

and still

$ wc -l *.xyz
  1323 test1.xyz
  1326 test_a.xyz
  2677 test.xyz
  5326 total

which indicates a discrepancy of three (non-hydrogen) atoms. Because the structure is a larger one than typical for this tool, perhaps Jmol is an alternative to consider. Without explicit definition of a (reference) sequence shared by both structures, the RMSD of the two initial structures (your .txt .pdb files, one with, one without hydrogens) already drops from 21.84 to 1.92 Angstroms, though I speculate one could do better (see e.g., here).

The archive attached contains the relevant data of this brief test. This includes the commands sent to Jmol's console (File -> Console) for the 1:1 superposition. (Jmol can run headerless for batch conversions / computations. This however is a different question.)

2023-03-14_check.zip

nbehrnd avatar Mar 14 '23 08:03 nbehrnd

Hi @Qmi3 , interesting problem. So other than Hydrogens, one file also included the rest of the amino acids, whereas the other only included the core. The same number of amino acids, different number of atoms.

I opened up a branch https://github.com/charnley/rmsd/tree/charnley/issue-98 to try to resolve this. I have added a new argument to only look at the alpha Carbons of a protein structure. This is extremely hard to reorder though! But it seems your structures were already ordered.

Please try it out and let me know if this solves your issue.

charnley avatar Mar 27 '23 20:03 charnley

You can use Pymol for this

andersx avatar Apr 15 '23 09:04 andersx