rdkit icon indicating copy to clipboard operation
rdkit copied to clipboard

rdShapeAlign is sensitive to starting conformation

Open priley-vv opened this issue 8 months ago • 35 comments

Describe the bug rdShapeAlign.AlignMol produces significantly different output based on the rigid body transformation of the input.

To Reproduce I have attached an ipython notebook which illustrates the issue by aligning omeprazole back to itself and showing success only 75% of the time

Expected behavior Alignment should be robust to rigid body transforms.

Screenshots n/a

Configuration (please complete the following information):

  • RDKit version: rdkit==2024.9.6
  • OS: macOS Sequoia 15.4.1
  • Python version : 3.12
  • Are you using conda? Yes
  • If you are using conda, which channel did you install the rdkit from? Via pip install dependency listed as "rdkit>=2024.9.5"
  • If you are not using conda: how did you install the RDKit? n/a

Additional context This obviously could be a bug in thr underlying library or the wrapping code in rdkit. I have filed bug https://github.com/ncbi/pubchem-align3d/issues/2 at pubchem-align3d

2025-05-08 pubchem align bug.ipynb.gz

priley-vv avatar May 08 '25 12:05 priley-vv

This issue was marked as stale because it has been open for 90 days with no activity.

github-actions[bot] avatar Aug 07 '25 03:08 github-actions[bot]

Is there any developer familiar with this that can point me in the right direction?

priley-vv avatar Aug 07 '25 11:08 priley-vv

I expect the issue is that the initial transformation into the inertial frame is the problem. Either that, or the line minimizer in the optimizer has some chaotic behavior.

The code for the inertial frame is here:

  // Determine rotation necessary to put into inertial frame of reference
  VStericFrameRotation( m, coord, n, ev );

It's a little suspicious that the determination of the inertial frame happens BEFORE the translation to COM in VTransform2StericFrame, but that's just from 30 seconds of looking at the code.

bp-kelley avatar Aug 07 '25 11:08 bp-kelley

I'll note that this code came from a contribution from PubChem, the original authors are:

Authors:  Evan Bolton, Leonid Zaslavsky, Paul Thiessen

We might try and ping them.

bp-kelley avatar Aug 07 '25 11:08 bp-kelley

I filed an issue on their repo, but haven't heard anything. https://github.com/ncbi/pubchem-align3d/issues/2 Do you have other contact info?

priley-vv avatar Aug 07 '25 12:08 priley-vv

@greglandrum might, he's back next week.

bp-kelley avatar Aug 07 '25 14:08 bp-kelley

I’ve been seeing some odd results myself lately. It’s on my list to look at tomorrow. I’ll add your example to the tests in case we’re seeing the same thing. If the issue is in the optimisation then I probably won’t be able to do anything, but if it’s in the initialisation that may be doable.

DavidACosgrove avatar Aug 08 '25 07:08 DavidACosgrove

@priley-vv @bp-kelley There definitely seems to be something odd going on. I have adapted @priley-vv's notebook and have come up with this result:

ref = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(4.7213,2.5182,1.5137;4.4043,1.6113,1.0079;3.0758,1.3692,1.0722;2.6466,0.2489,0.456;1.1692,-0.0049,0.5379;0.9382,-1.073,0.5858;0.7767,0.4398,1.4605;0.2799,0.7535,-0.8523;0.4072,2.2559,-0.6946;-1.397,0.2739,-0.4568;-1.7125,-0.8269,0.188;-3.085,-0.8048,0.282;-3.965,-1.7284,0.8713;-3.5932,-2.6284,1.3503;-5.3395,-1.4631,0.8279;-6.029,-2.1717,1.2809;-5.8306,-0.3083,0.2142;-7.1752,-0.0818,0.1901;-7.6184,1.1125,-0.4496;-7.2495,2.0066,0.0645;-8.7108,1.1318,-0.3763;-7.3713,1.1144,-1.5167;-4.9691,0.6219,-0.377;-5.2846,1.5336,-0.8669;-3.6042,0.3371,-0.3218;-2.5022,1.0085,-0.7856;-2.5082,1.8901,-1.2814;3.4644,-0.6385,-0.2174;2.9174,-1.8598,-0.8757;2.0477,-1.6254,-1.4971;3.6371,-2.3342,-1.5505;2.6348,-2.6025,-0.1225;4.825,-0.3553,-0.2618;5.6802,-1.1956,-0.9111;6.2567,-2.2506,-0.1449;6.8888,-2.8477,-0.8081;5.48,-2.8993,0.272;6.879,-1.8507,0.6611;5.312,0.789,0.3602;6.7599,1.1362,0.3412;6.9177,2.1983,0.5593;7.3008,0.5521,1.0919;7.1989,0.9483,-0.6443),wU:7.6|")
probe = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(-1.32153,-4.97222,-2.11042;-1.11383,-4.52338,-1.14388;-0.63428,-3.26081,-1.20402;-0.362868,-2.6725,-0.0211404;0.169454,-1.27136,-0.106371;0.884907,-1.05475,0.692192;0.711736,-1.14765,-1.05169;-1.16791,-0.0427214,-0.0819799;-1.95615,-0.218445,-1.36497;-0.215269,1.46666,-0.194606;1.00287,1.60246,0.279491;1.35968,2.89747,-0.0183527;2.55231,3.59036,0.250442;3.37463,3.1085,0.769368;2.65944,4.92299,-0.166852;3.57823,5.46815,0.0365025;1.60806,5.55453,-0.835526;1.74659,6.85341,-1.22739;0.641362,7.44566,-1.90542;0.436447,6.9447,-2.8577;0.92317,8.47678,-2.143;-0.244791,7.49809,-1.26374;0.414062,4.87992,-1.11164;-0.435794,5.31049,-1.62457;0.329678,3.55345,-0.686829;-0.66478,2.61454,-0.785806;-1.5684,2.7491,-1.21989;-0.542453,-3.2702,1.21184;-0.218994,-2.55639,2.4807;-0.653232,-1.5521,2.49812;-0.621383,-3.06215,3.36425;0.865897,-2.4842,2.60944;-1.0376,-4.56935,1.23282;-1.23419,-5.20992,2.42029;-0.141111,-5.9658,2.93629;-0.443364,-6.38958,3.89804;0.733158,-5.32733,3.09716;0.117383,-6.78659,2.26079;-1.3311,-5.21502,0.0368067;-1.86296,-6.60553,0.0061196;-2.35176,-6.824,-0.949845;-1.05123,-7.32618,0.144158;-2.61535,-6.75864,0.786955),wD:7.6|")
shape, color = rdShapeAlign.AlignMol(
        ref=ref, 
        probe=probe, 
        opt_param=0.5, 
        max_preiters=10,
        max_postiters=30)
print(f"{shape=:.2f} {color=:.2f}")

Which gives

shape=0.94 color=0.92

And the overlay looks like:

Image

If I then feed the first result back into the overlay, so:

ref = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(4.7213,2.5182,1.5137;4.4043,1.6113,1.0079;3.0758,1.3692,1.0722;2.6466,0.2489,0.456;1.1692,-0.0049,0.5379;0.9382,-1.073,0.5858;0.7767,0.4398,1.4605;0.2799,0.7535,-0.8523;0.4072,2.2559,-0.6946;-1.397,0.2739,-0.4568;-1.7125,-0.8269,0.188;-3.085,-0.8048,0.282;-3.965,-1.7284,0.8713;-3.5932,-2.6284,1.3503;-5.3395,-1.4631,0.8279;-6.029,-2.1717,1.2809;-5.8306,-0.3083,0.2142;-7.1752,-0.0818,0.1901;-7.6184,1.1125,-0.4496;-7.2495,2.0066,0.0645;-8.7108,1.1318,-0.3763;-7.3713,1.1144,-1.5167;-4.9691,0.6219,-0.377;-5.2846,1.5336,-0.8669;-3.6042,0.3371,-0.3218;-2.5022,1.0085,-0.7856;-2.5082,1.8901,-1.2814;3.4644,-0.6385,-0.2174;2.9174,-1.8598,-0.8757;2.0477,-1.6254,-1.4971;3.6371,-2.3342,-1.5505;2.6348,-2.6025,-0.1225;4.825,-0.3553,-0.2618;5.6802,-1.1956,-0.9111;6.2567,-2.2506,-0.1449;6.8888,-2.8477,-0.8081;5.48,-2.8993,0.272;6.879,-1.8507,0.6611;5.312,0.789,0.3602;6.7599,1.1362,0.3412;6.9177,2.1983,0.5593;7.3008,0.5521,1.0919;7.1989,0.9483,-0.6443),wU:7.6|")
probe = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(-1.32153,-4.97222,-2.11042;-1.11383,-4.52338,-1.14388;-0.63428,-3.26081,-1.20402;-0.362868,-2.6725,-0.0211404;0.169454,-1.27136,-0.106371;0.884907,-1.05475,0.692192;0.711736,-1.14765,-1.05169;-1.16791,-0.0427214,-0.0819799;-1.95615,-0.218445,-1.36497;-0.215269,1.46666,-0.194606;1.00287,1.60246,0.279491;1.35968,2.89747,-0.0183527;2.55231,3.59036,0.250442;3.37463,3.1085,0.769368;2.65944,4.92299,-0.166852;3.57823,5.46815,0.0365025;1.60806,5.55453,-0.835526;1.74659,6.85341,-1.22739;0.641362,7.44566,-1.90542;0.436447,6.9447,-2.8577;0.92317,8.47678,-2.143;-0.244791,7.49809,-1.26374;0.414062,4.87992,-1.11164;-0.435794,5.31049,-1.62457;0.329678,3.55345,-0.686829;-0.66478,2.61454,-0.785806;-1.5684,2.7491,-1.21989;-0.542453,-3.2702,1.21184;-0.218994,-2.55639,2.4807;-0.653232,-1.5521,2.49812;-0.621383,-3.06215,3.36425;0.865897,-2.4842,2.60944;-1.0376,-4.56935,1.23282;-1.23419,-5.20992,2.42029;-0.141111,-5.9658,2.93629;-0.443364,-6.38958,3.89804;0.733158,-5.32733,3.09716;0.117383,-6.78659,2.26079;-1.3311,-5.21502,0.0368067;-1.86296,-6.60553,0.0061196;-2.35176,-6.824,-0.949845;-1.05123,-7.32618,0.144158;-2.61535,-6.75864,0.786955),wD:7.6|")
shape, color = rdShapeAlign.AlignMol(
        ref=ref, 
        probe=probe, 
        opt_param=0.5, 
        max_preiters=10,
        max_postiters=30)
print(f"{shape=:.2f} {color=:.2f}")
shape, color = rdShapeAlign.AlignMol(
        ref=ref, 
        probe=probe, 
        opt_param=0.5, 
        max_preiters=10,
        max_postiters=30)
print(f"{shape=:.2f} {color=:.2f}")

I get:

shape=0.94 color=0.92
shape=0.98 color=0.62

and the overlay is

Image

The second overlay is worse than first, which is the opposite of what one would expect.

I will descend into the rabbit hole...

DavidACosgrove avatar Aug 09 '25 09:08 DavidACosgrove

@bp-kelley @priley-vv I have spent a torrid but possibly fruitful time in the weeds. It has taken all day :-(.

I have found at least 1 error, that, when fixed, has caused a single test case to, I think, behave correctly. In shape_neighbor.cpp, which is part of the pubchem-shape download, on line 688, it copies old_quattrans into qom because it hasn't made a good step. It does it with:

memcpy( qom, old_quattrans, 7 * sizeof( float ) );

qom and old_quatrans are both doubles.

If you look at line 678, as an example, it does it correctly:

 memcpy( qom, old_quattrans, 7 * sizeof( double ) );

One upshot of the line 688 copy is that the colour Tanimoto is computed incorrectly when being saved as the best answer when the optimiser makes a bad step because it uses a quaternion which is an interesting mix of 2.

I will spend some time on Monday seeing if this is the full solution to the errors we have been seeing, but for now I really need a beer!

@greglandrum Since this is in the pubchem code, what's the method of getting it updated?

DavidACosgrove avatar Aug 09 '25 17:08 DavidACosgrove

@DavidACosgrove next time we’re in the same city I’ll buy you one.

bp-kelley avatar Aug 10 '25 00:08 bp-kelley

@bp-kelley @priley-vv @greglandrum I have re-run Pat's original notebook with the fixed code, and it's better, I think, but only marginally. These are his results for successively more iterations for the optimisation

With     1 times more max_{pre,post}iters,  70/100 successfully aligned
With    10 times more max_{pre,post}iters,  77/100 successfully aligned
With   100 times more max_{pre,post}iters,  77/100 successfully aligned
With  1000 times more max_{pre,post}iters,  66/100 successfully aligned
With 10000 times more max_{pre,post}iters,  69/100 successfully aligned

And this is a sample of what I get now

With     1 times more max_{pre,post}iters,  77/100 successfully aligned
With    10 times more max_{pre,post}iters,  84/100 successfully aligned
With   100 times more max_{pre,post}iters,  92/100 successfully aligned
With  1000 times more max_{pre,post}iters,  88/100 successfully aligned
With 10000 times more max_{pre,post}iters,  91/100 successfully aligned

There are still some pretty strange dependencies on the input geometry. For example Taking

probe_om = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(3.77371,4.00475,-0.802776;2.89633,3.5975,-1.29588;2.43356,2.44762,-0.756167;1.34508,1.91185,-1.34535;0.84519,0.635076,-0.733974;0.418899,-0.0391185,-1.48246;1.68621,0.0999027,-0.276495;-0.374918,0.949996,0.574018;0.356982,1.64186,1.70712;-0.749127,-0.726914,1.07037;-0.694496,-1.76101,0.261227;-1.04831,-2.84072,1.03721;-1.15937,-4.19969,0.697255;-0.953028,-4.54286,-0.311374;-1.54403,-5.10791,1.69158;-1.63311,-6.16188,1.4387;-1.81254,-4.67871,2.99352;-2.185,-5.59053,3.93676;-2.44613,-5.09501,5.24758;-1.54944,-4.65785,5.69996;-2.72842,-5.95289,5.8668;-3.29566,-4.40364,5.25594;-1.70669,-3.32999,3.34951;-1.90001,-2.9336,4.33756;-1.32202,-2.44332,2.34305;-1.12265,-1.08667,2.33539;-1.2321,-0.465194,3.12587;0.695801,2.45645,-2.43695;-0.504672,1.8058,-3.03667;-1.24128,1.54036,-2.2722;-1.03481,2.46029,-3.73594;-0.21304,0.90522,-3.58683;1.20143,3.63972,-2.96386;0.602598,4.22243,-4.04132;1.0684,3.83093,-5.33058;0.474862,4.35859,-6.08237;0.941711,2.75422,-5.48067;2.11949,4.10356,-5.46364;2.3225,4.22747,-2.38834;2.90207,5.49377,-2.91551;3.53434,5.98432,-2.1671;3.51626,5.29218,-3.79836;2.11583,6.20834,-3.18095),wD:7.6|")

And doing the same as I did above: an overlay, then feeding the result into a 2nd overlay, I get scores of:

1  shape1=0.76 color1=0.16
2  shape2=1.00 color2=1.00
Image

In the picture, the reference is cyan, the probe conformation is blue and the result is in red. You can see that the 6,5 ring on the left is 180 degrees opposed in the overlay. However, if I feed that overlay back in, it gives a perfect result on the second go.

However, I have a 2nd input conformation where the scores for an overlay followed by another are:

probe_om = Chem.MolFromSmiles("[H]c1nc(C([H])([H])[S@](=O)c2nc3c([H])c([H])c(OC([H])([H])[H])c([H])c3n2[H])c(C([H])([H])[H])c(OC([H])([H])[H])c1C([H])([H])[H] |(1.36605,-3.70605,3.91438;1.4319,-3.59359,2.83651;0.744765,-2.54042,2.34018;0.802561,-2.3714,1.00337;0.0334414,-1.19796,0.469201;-0.385129,-1.39804,-0.521384;-0.817125,-0.9921,1.13038;1.05374,0.30352,0.412393;1.36315,0.684986,1.84663;-0.130986,1.46828,-0.249565;-1.11084,1.13728,-1.06037;-1.77366,2.31494,-1.31893;-2.90084,2.55958,-2.12164;-3.39081,1.75391,-2.65876;-3.38224,3.87143,-2.21556;-4.25393,4.07139,-2.83436;-2.76035,4.91718,-1.52925;-3.25577,6.1826,-1.64366;-2.58059,7.20917,-0.920975;-2.63475,7.04401,0.160501;-3.10639,8.1476,-1.12582;-1.55081,7.33879,-1.27101;-1.63772,4.69171,-0.725595;-1.11118,5.45592,-0.169283;-1.17572,3.37739,-0.64681;-0.127889,2.80679,0.0290277;0.52834,3.29139,0.627019;1.50408,-3.18992,0.138697;1.52159,-2.94084,-1.33161;1.74837,-1.89469,-1.55876;2.28995,-3.52365,-1.84962;0.555299,-3.20605,-1.77287;2.19843,-4.26271,0.686779;2.90736,-5.10467,-0.117947;2.21602,-6.23827,-0.637022;2.90488,-6.79265,-1.28048;1.35387,-5.92872,-1.23599;1.88888,-6.89822,0.171843;2.16659,-4.47546,2.06053;2.89299,-5.60963,2.69561;3.04846,-5.43372,3.76585;2.32252,-6.53666,2.58393;3.88479,-5.74288,2.25086),wU:7.6|")
1  shape1=0.95 color1=0.83
2  shape2=0.95 color2=0.85

In this case, the first result is essentially correct but doesn't go the last steps to a perfect score, and restarting doesn't do much at all which is inconsistent with the first result.

Image

In the picture, the reference and probe are again cyan and blue, the first result is red and the second is purple.

The starting points for the optimisations in each case are rotations of the original input structure. So for the first pair of overlays, they are:

Image

and

Image

Which certainly explains why different input conformations can give different solutions. @bp-kelley my vague recollection from discussions with Andy Grant at the time is that in ROCS the initial positions are chosen based on the shape multipoles. Is that correct? I would certainly expect that given the same input conformation the optimiser would start from the same place each time. I might also include perhaps just sticking with the input as is, so that it could be used to refine an input pose.

I think I've reached the end of what I can do for now. It would be great if we could re-engage the PubChem folk to take a look at it.

DavidACosgrove avatar Aug 11 '25 07:08 DavidACosgrove

@DavidACosgrove For most symmetries of molecules, ROCS uses four intertial starts and takes the best optimized results. So one starts with the intertial frame (first and second moments of intertia), and then reflects the axes. This helps with the ring symmetry and color problems.

bp-kelley avatar Aug 11 '25 12:08 bp-kelley

I think if we can do overlays with no optimization (i.e. get the initial pose) we might be able to see what is going on.

bp-kelley avatar Aug 11 '25 12:08 bp-kelley

@bp-kelley The two pictures immediately above are the 4 initial poses the optimiser uses in one of the pairs of overlays above, where I have taken a random orientation of the probe molecule, overlaid it, then taken the result of that and overlaid it again. I think that is what you are asking for.

DavidACosgrove avatar Aug 11 '25 12:08 DavidACosgrove

@bp-kelley Do you mean that the components of the moments of inertia are aligned with the cartesian axes? That doesn't seem to be happening.

DavidACosgrove avatar Aug 11 '25 12:08 DavidACosgrove

Amazing debugging @DavidACosgrove. My understanding of the intention of the algorithm is that no matter the initial pose of the molecules, the starting positions for optimization should be exactly the same. Maybe we should check the calculation of the inertial reference frame?

priley-vv avatar Aug 11 '25 12:08 priley-vv

Yes, the largest eigenvector of the mass diagonalization matrix => x axis, then y then z. I would start with the inertial reference frame, it should generate the same poses regardless of the input orientation (except for symmetry issues)

bp-kelley avatar Aug 11 '25 12:08 bp-kelley

@bp-kelley : for symmetric (or even almost symmetric due to numeric noise), you can get flips along any of the axes. That's what you mean, correct?

priley-vv avatar Aug 11 '25 12:08 priley-vv

@bp-kelley @priley-vv It was my assumption also that it did something like that, but I don't think it is. It is does go off to TransformCoordToStericFrameAndCalculateQrat where it does seem to do some relevant calculations. I does do 4-fold reflections but seemingly of the starting configuration. I will dig in there and see what I find, but probably not today.

DavidACosgrove avatar Aug 11 '25 12:08 DavidACosgrove

@priley-vv that's partly it, the other aspect is that color is invisible to the inertial frames so symmetry is a real problem for optimization.

bp-kelley avatar Aug 11 '25 13:08 bp-kelley

I'll also point out that the target should also be in its inertial frame but isn't flipped on it's axes.

bp-kelley avatar Aug 12 '25 12:08 bp-kelley

After a bit of a journey, I can confirm that if I use the RDKit MolTransforms::canonicalizeConformer function on the reference and fit configurations to start with, it does end up with scores of 2.0. I have tried it with 2 different rotations of esomeprazole from @priley-vv's input set and got the same answer. MolTransforms::canonicalizeConformer put them different ways round on the axes (which I think might be expected) and they both came out with the same score. So I think that the key issue is that shape_neighbor() isn't doing that for the optimisation. As I mentioned above, it calculates the alignment along the axes but doesn't pass those coords into the optimiser. It also doesn't save the transformation it used, which is going to complicate matters. I think this is solvable, and I will attempt to demonstrate this :-).

DavidACosgrove avatar Aug 21 '25 15:08 DavidACosgrove

I thought I'd document here, as I haven't seen it anywhere else, that it seems to be optimising only the shape fit, and scoring the features/colours at the end. I had imagined it did it on both at the same time. I wonder how that works with very small molecules like synthons, which is clearly of interest to me.

DavidACosgrove avatar Aug 21 '25 15:08 DavidACosgrove

I thought I'd document here, as I haven't seen it anywhere else, that it seems to be optimising only the shape fit, and scoring the features/colours at the end. I had imagined it did it on both at the same time. I wonder how that works with very small molecules like synthons, which is clearly of interest to me.

This is another issue I started to work on but got caught up in trying to get my rdkit environment set up. The default value for

(float)opt_param=1.0

is poor and undocumented 0.0 means optimize color only 0.5 means equal weight to color+shape (what I think most people would expect) 1.0 means optimize shape only

I think this need to be documented and the default changed to 0.5

priley-vv avatar Aug 21 '25 15:08 priley-vv

I think this need to be documented and the default changed to 0.5

Just made this PR https://github.com/rdkit/rdkit/pull/8724 but as I say there, I haven't gotten my dev environment up to test this.

priley-vv avatar Aug 21 '25 16:08 priley-vv

I've had another look at the code. The gradients for the optimiser are only calculated for the atoms, not the features, AFAICT. The shape score is then calculated to determine whether to accept the move or not. So the features aren't actively driving the optimisation, just used in the assessment of poses as the optimisation proceeds.

DavidACosgrove avatar Aug 21 '25 16:08 DavidACosgrove

To be fair, this is also how fastrocs works, shape opt then color. For ROCs the color Gaussian are tiny in width and shape dominates the opt anyway.

bp-kelley avatar Aug 21 '25 16:08 bp-kelley

@DavidACosgrove : thanks for the pointer about doing canonicalization. I can confirm that adding this canonicalization (for reference and probe) before calling rdShapeAlign makes it much more reproducible in my internal, can't be shared, non-toy test case.

priley-vv avatar Aug 26 '25 15:08 priley-vv

@priley-vv that’s good to hear. I am hoping to make it an option in the RDKit code in the next week or so.

DavidACosgrove avatar Aug 26 '25 16:08 DavidACosgrove

I'm sure you are thinking about this, but I had to do a little bit a bookkeeping to still align back to the original configuration

        # Canonicalize here. The underlying shape matching should be insensitive to
        # this, but it may not actually be so we'll do this (it won't hurt):
        # https://github.com/rdkit/rdkit/issues/8513
        # Save off transforms so users of this class can get back to original
        # coordinate system if needed
        self.orig_to_canon_transform = Chem.rdMolTransforms.ComputeCanonicalTransform(
            self.ref.GetConformer()
        )
        self.canon_to_orig_transform = np.linalg.inv(self.orig_to_canon_transform)
        Chem.rdMolTransforms.TransformConformer(
            self.ref.GetConformer(), self.orig_to_canon_transform
        )

then after I ran the alignment

        Chem.rdMolTransforms.TransformConformer(
            probe.GetConformer(), self.canon_to_orig_transform
        )

priley-vv avatar Aug 26 '25 17:08 priley-vv