pysoftk
pysoftk copied to clipboard
Failed linear polymerization with Agarose monomer and large, flexible molecules take too long
Dear Alejandro,
thank you very much for sharing PySoftK, I found it very useful for easily building polymers.
I have run into some difficulties with some molecules unfortunately, such as Agarose:
from pysoftk.linear_polymer.super_monomer import *
from pysoftk.linear_polymer.linear_polymer import *
from pysoftk.format_printers.format_mol import *
monomer=Chem.MolFromSmiles('OC[C@@H]1O[C@@H](O[C@@H]2[C@H]3CO[C@@H]2[C@H](O)[C@H](Br)O3)[C@@H](O)[C@H](O)[C@@H]1OBr')
AllChem.EmbedMolecule(monomer)
monomer = Chem.AddHs(monomer)
n_monomers:int = 2
polymer=Lp(monomer,"Br",n_monomers,shift=1.0).linear_polymer()
polymer.write("mol", f"/home/mpalermo/tmp/PySoftK/polymer_{n_monomers}.mol") # linear_polymer() outputs an openbabel object
When trying to make a polymer starting from the monomer, you get a BadConformerID
error from RDKit. Upon further inspection, I have found this is due to the call to mol.GetConformer()
in _copy_mol()
from module linear_polymer/linear_polymer.py
. I suspect that when no conformers are embedded in the mol object, GetConformer()
tries to generate them - I'm not sure using which RDKit method - and fails.
As a workaround, I added an explicit call to generate 1 conformer:
def _copy_mol(self):
"""Function to replicate super_monomers.
Parameters
-----------
mol : rdkit.Chem.rdchem.Mol
RDKit Mol object
Return
--------
fragments : `list`
List of RDKit Mol objects
"""
mol=self.mol
cids = AllChem.EmbedMultipleConfs(mol,1) # <-- HERE - explicit call to generate conforms
CanonicalizeConformer(mol.GetConformer())
n_copies=self.n_copies
fragments=[mol for _ in range(int(n_copies))]
return fragments
With this modification, now PySoftK works also with Agarose. Nevertheless, while correctly functioning, it would still take hours to generate the polymer. I identified the source of this in the WeightedRotorSearch()
call in function rotor_opt()
from module tools/utils_ob.py
. I think it may take a very long time for large molecules with lots of rotatable bonds. Commenting this out and adding a call to RDKit make3D()
function before running OpenBabel Relaxation and Rotor optimization seems to be returning perfectly reasonable geometries anyway.
Is it ok if I send a pull request so that you can review these modifications and evaluate whether to integrate them into PySoftK?
Thank you!