pysoftk icon indicating copy to clipboard operation
pysoftk copied to clipboard

Failed linear polymerization with Agarose monomer and large, flexible molecules take too long

Open mattiafelice-palermo opened this issue 1 year ago • 3 comments

Dear Alejandro,

thank you very much for sharing PySoftK, I found it very useful for easily building polymers.

I have run into some difficulties with some molecules unfortunately, such as Agarose:

from pysoftk.linear_polymer.super_monomer import *
from pysoftk.linear_polymer.linear_polymer import *
from pysoftk.format_printers.format_mol import *

monomer=Chem.MolFromSmiles('OC[C@@H]1O[C@@H](O[C@@H]2[C@H]3CO[C@@H]2[C@H](O)[C@H](Br)O3)[C@@H](O)[C@H](O)[C@@H]1OBr')
AllChem.EmbedMolecule(monomer)
monomer = Chem.AddHs(monomer)
n_monomers:int = 2
polymer=Lp(monomer,"Br",n_monomers,shift=1.0).linear_polymer()
polymer.write("mol", f"/home/mpalermo/tmp/PySoftK/polymer_{n_monomers}.mol") # linear_polymer() outputs an openbabel object

When trying to make a polymer starting from the monomer, you get a BadConformerID error from RDKit. Upon further inspection, I have found this is due to the call to mol.GetConformer() in _copy_mol() from module linear_polymer/linear_polymer.py. I suspect that when no conformers are embedded in the mol object, GetConformer() tries to generate them - I'm not sure using which RDKit method - and fails.

As a workaround, I added an explicit call to generate 1 conformer:

    def _copy_mol(self):
       """Function to replicate super_monomers.

       Parameters
       -----------

       mol : rdkit.Chem.rdchem.Mol
          RDKit Mol object

       Return
       --------

       fragments : `list`
          List of RDKit Mol objects
       """    

       mol=self.mol
       cids = AllChem.EmbedMultipleConfs(mol,1) # <-- HERE - explicit call to generate conforms
       CanonicalizeConformer(mol.GetConformer())
       
       n_copies=self.n_copies
       
       fragments=[mol for _ in range(int(n_copies))]

       return fragments

With this modification, now PySoftK works also with Agarose. Nevertheless, while correctly functioning, it would still take hours to generate the polymer. I identified the source of this in the WeightedRotorSearch() call in function rotor_opt() from module tools/utils_ob.py. I think it may take a very long time for large molecules with lots of rotatable bonds. Commenting this out and adding a call to RDKit make3D() function before running OpenBabel Relaxation and Rotor optimization seems to be returning perfectly reasonable geometries anyway.

Is it ok if I send a pull request so that you can review these modifications and evaluate whether to integrate them into PySoftK?

Thank you!

mattiafelice-palermo avatar Nov 23 '23 08:11 mattiafelice-palermo