
faster rmsd calculations in rdkit

Open UnixJunkie opened this issue 7 years ago • 9 comments

I think Douglas Theobald's method (QCP) is the fastest there is. We should use that. He has some code in C or C++, I don't remember which. I can send it if needed.

UnixJunkie avatar Jun 15 '17 08:06 UnixJunkie

Code and further information is here http://theobald.brandeis.edu/qcp/

gedeck avatar Jun 15 '17 11:06 gedeck

I implemented the QCP code from their paper for OEChem a while back. It is substantially faster, and I could do it again.

The downside is that, for identical conformations, it quite often returns not exactly 0 RMSD but some very small number (< 10e-6). We deemed this sufficient, but it was kind of annoying.

There are two versions, one that computes the RMSD without the transform and one that computes the RMSD and the transform. The latter is more complicated.
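As an illustration of the RMSD-only variant, here is a minimal pure-Python sketch of the idea from the QCP paper: build the 4x4 key matrix from the inner products of the centered coordinates, find its largest eigenvalue by Newton-Raphson on the quartic characteristic polynomial, and read the RMSD off that eigenvalue without ever forming a rotation. This is not the reference C code and not an RDKit API. The names `centered`, `det3`, `det4` and `qcp_rmsd` are made up for this example, and the constant term is computed with a plain cofactor expansion rather than the hand-optimized expression used in the published code.

```python
import math

def centered(coords):
    """Translate coordinates so that their centroid sits at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(p[0] - cx, p[1] - cy, p[2] - cz) for p in coords]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def det4(m):
    """Determinant of a 4x4 matrix by cofactor expansion along the first row."""
    total = 0.0
    for j in range(4):
        minor = [[row[k] for k in range(4) if k != j] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det3(minor)
    return total

def qcp_rmsd(a, b, tol=1e-11, max_iter=50):
    """Minimal RMSD between two equally sized point sets, no rotation matrix."""
    a, b = centered(a), centered(b)
    n = len(a)
    ga = sum(x * x + y * y + z * z for x, y, z in a)  # inner product of a
    gb = sum(x * x + y * y + z * z for x, y, z in b)  # inner product of b
    # Cross-covariance matrix: m[i][j] = sum over atoms of a[i] * b[j]
    m = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = m
    # Symmetric, traceless 4x4 key matrix
    k = [[sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
         [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
         [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
         [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz]]
    # Characteristic polynomial: lam^4 + c2*lam^2 + c1*lam + c0
    c2 = -2.0 * sum(v * v for row in m for v in row)
    c1 = -8.0 * det3(m)
    c0 = det4(k)
    lam = 0.5 * (ga + gb)  # upper bound on the largest eigenvalue
    for _ in range(max_iter):
        lam2 = lam * lam
        f = lam2 * lam2 + c2 * lam2 + c1 * lam + c0
        df = 4.0 * lam2 * lam + 2.0 * c2 * lam + c1
        if df == 0.0:
            break
        step = f / df
        lam -= step
        if abs(step) < tol * abs(lam):
            break
    # abs() guards against a slightly negative value caused by round-off
    return math.sqrt(abs(ga + gb - 2.0 * lam) / n)
```

For example, `qcp_rmsd(a, b)` with `b` a rigidly rotated and translated copy of `a` should return a value very close to zero, though not necessarily exactly zero.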

Cheers, Brian


bp-kelley avatar Jun 15 '17 16:06 bp-kelley

I've had stability issues with the single precision variant of the QCP algorithm before. I've actually implemented a GPU version of this back when I was in grad school for clustering purposes, which typically requires either all-against-one or all-against-all types of RMSD calculations.

In particular, I even recall writing a comment about the determinant calculation (via Laplace expansion), which was particularly prone to floating-point imprecision:

https://github.com/rmcgibbo/GPURMSD/blob/master/gpurmsd/kernel_rmsd.cu#L302

proteneer avatar Jun 15 '17 21:06 proteneer

That is very interesting. Our implementation didn't consider the issues of clustering, but I do recall that in practice, it was the small rmsds that had the issues. We thought about setting a cutoff for using a slower version for really close rmsds but never implemented it.
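One plausible reason small RMSDs are the problem case: QCP obtains the RMSD as sqrt((Ga + Gb - 2*lambda) / N), where Ga and Gb are the inner products of the two centered structures and lambda is the largest eigenvalue. For nearly identical conformations this subtracts two almost equal numbers, so round-off in lambda dominates the result. The sketch below is only a back-of-the-envelope model of that cancellation, not a measurement of any implementation mentioned in this thread. The function name and the example sizes (50 atoms, each about 5 Å from the centroid) are assumptions chosen for illustration.

```python
import math
import sys

def rmsd_noise_floor(inner_product_sum, n_atoms, eps):
    """Rough RMSD floor if Ga + Gb - 2*lambda carries a relative error of eps."""
    return math.sqrt(eps * inner_product_sum / n_atoms)

DOUBLE_EPS = sys.float_info.epsilon  # machine epsilon for float64
SINGLE_EPS = 2.0 ** -23              # machine epsilon for float32

# Assumed example: 50 atoms at roughly 5 Angstrom from the centroid,
# so Ga + Gb is about 2 * 50 * 25 = 2500 Angstrom^2.
n_atoms = 50
g_sum = 2.0 * n_atoms * 5.0 ** 2

floor_double = rmsd_noise_floor(g_sum, n_atoms, DOUBLE_EPS)
floor_single = rmsd_noise_floor(g_sum, n_atoms, SINGLE_EPS)
print(f"double precision floor: {floor_double:.1e} Angstrom")
print(f"single precision floor: {floor_single:.1e} Angstrom")
```

Under these assumptions the double-precision floor comes out on the order of 1e-7 Å and the single-precision floor on the order of 1e-3 Å, which is in line with the small non-zero RMSDs and the single-precision stability issues described above.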


Brian Kelley


bp-kelley avatar Jun 15 '17 22:06 bp-kelley

We could use it for the BestRMSD code. We can always follow up with a more precise approach once the 'best' alignment is found.

gedeck avatar Jun 15 '17 22:06 gedeck

Here is some code that I have used in production: https://github.com/UnixJunkie/durandal_qcp/blob/master/src/qcprot.cc (look for rmsd_without_rotation_matrix). I think that's what most people want. Superposing molecules is a separate problem.

UnixJunkie avatar Jun 16 '17 00:06 UnixJunkie

@bp-kelley

Interestingly enough, RMSD actually satisfies the requirements of a metric (the only non-trivial part is the triangle inequality). For a rigorous mathematical proof, see:

http://scripts.iucr.org/cgi-bin/paper?S0108767302011637

So having an optimized variant would be quite nice!
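One practical consequence of the metric property, sketched here under the assumption that the RMSDs of two conformations to a shared reference are already known: the triangle inequality bounds their mutual RMSD, so some exact calculations can be skipped during clustering. The function names below are hypothetical and not part of RDKit or any library mentioned in this thread.

```python
def rmsd_bounds(d_ac, d_bc):
    """Bounds on d(a, b) implied by the triangle inequality.

    d_ac and d_bc are the RMSDs of a and b to a shared reference c.
    Returns (lower, upper) such that lower <= d(a, b) <= upper.
    """
    return abs(d_ac - d_bc), d_ac + d_bc

def can_skip(d_ac, d_bc, cutoff):
    """True if a and b cannot be within cutoff of each other."""
    lower, _ = rmsd_bounds(d_ac, d_bc)
    return lower > cutoff

# a is 3.0 from the reference and b is 1.0 from it, so d(a, b) lies in [2.0, 4.0]
print(rmsd_bounds(3.0, 1.0))
# With a 1.5 clustering cutoff the pair can be rejected without an exact RMSD
print(can_skip(3.0, 1.0, 1.5))
```

The lower bound is only useful when the two reference distances differ by more than the cutoff; otherwise the exact RMSD still has to be computed.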

Note that @ihaque has also written an insanely optimized CPU RMSD somewhere; maybe he can point us to the location?

proteneer avatar Jun 16 '17 14:06 proteneer

Yup, here you go: https://github.com/pandegroup/IRMSD

ihaque avatar Jun 16 '17 15:06 ihaque

How can I set RDKit to use the Hungarian algorithm to calculate RMSD?

Dadiao-shuai avatar Dec 22 '23 03:12 Dadiao-shuai