openmm
Adding new residue template
I am trying to incorporate non-canonical amino acids using PDBFixer's applyMutations() function. Is there a protocol to achieve this?
The PDB file has been added to the template library and the residue name to the 'proteinResidues' list. What other modifications need to be made? The force field parameters will be the next step in order to run a simulation.
The trial amino acid in this case is hydroxyproline, which is already present in the charmm36 force field, so that won't be a problem.
I have attached my HYP.pdb, the PRO.pdb for comparison, and the HYP residue snippet from charmm36.xml. The last three files are the original tripeptide (Gly-Ala-Gly), on which applyMutations(['ALA-2-HYP'], 'A') is performed, and the mutated peptide before and after the PDBFixer repair steps (adding missing hydrogens etc.).
I apologize for the file-overload but it would be a pity if a fault went unnoticed. The faulty mutation can also be visually inspected via PyMol (see png).
Can I get some feedback on my approach? Because I have the feeling that this could work.
Any help is greatly appreciated, thanks!
HYP.txt PRO.txt HYPfromCharmm36.txt
original.txt
mutant_prior_to_fixing.txt
mutant_after_fixing.txt
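For reference, the mutation attempt described above can be sketched as follows. This is a minimal sketch, not the poster's actual script: the helper names `parse_mutation` and `mutate_and_fix` and any file paths passed to them are hypothetical, and it assumes PDBFixer is installed next to an OpenMM that is importable as `openmm.app` (older installs use `simtk.openmm.app`). With a stock PDBFixer, the mutation step is expected to fail for residue names that have no template, which is the problem raised in this issue.

```python
def parse_mutation(spec):
    """Split a PDBFixer-style mutation string such as 'ALA-2-HYP'."""
    old_name, residue_number, new_name = spec.split("-")
    return old_name, int(residue_number), new_name


def mutate_and_fix(pdb_in, pdb_out, mutations, chain_id, ph=7.0):
    """Apply mutations, rebuild missing atoms and hydrogens, write a PDB file."""
    # Imported lazily so parse_mutation stays usable without OpenMM installed.
    from pdbfixer import PDBFixer
    from openmm.app import PDBFile

    fixer = PDBFixer(filename=pdb_in)
    fixer.applyMutations(mutations, chain_id)  # e.g. ['ALA-2-HYP'], 'A'
    fixer.findMissingResidues()
    fixer.findMissingAtoms()
    fixer.addMissingAtoms()
    fixer.addMissingHydrogens(ph)
    with open(pdb_out, "w") as handle:
        PDBFile.writeFile(fixer.topology, fixer.positions, handle)
```

Calling `mutate_and_fix("original.pdb", "mutant.pdb", ["ALA-2-HYP"], "A")` would correspond to the tripeptide case, assuming the attachment is saved under that (hypothetical) file name.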
Did you figure this out? Sorry no one seems to have replied. I don't have a lot of experience working with non-canonical amino acids.
Dear Peter,
Yes, my mentor and I found a working method. It consists of adding (in this case) HYP.pdb to pdbfixer/templates and adding the HYP residue to simtk/openmm/app/data/hydrogens.xml and to simtk/openmm/app/data/residues.xml. The last adjustment is adding ‘HYP’ to the proteinResidues list in pdbfixer.py. HYP was quite an easy one since it is already included in the charmm36 force field, so adding other NCAAs will need some extra adjustment.
I have indeed noticed that incorporating NCAAs (other than post-translational modifications) is not standard practice in (bio)simulation software, so I will create some documentation on my GitHub profile once a solid protocol is found.
Thanks anyway for the follow up! Have a nice evening.
Kind regards, Joeri
(Sent as an email reply to Peter Eastman's comment of Wednesday, 31 March 2021 at 21:43 on openmm/openmm issue #3083: https://github.com/openmm/openmm/issues/3083#issuecomment-811391133)
I have reopened the issue since I received a request asking whether any progress was made. I will post the current protocol (made by Jelle Vekeman and me) in this comment so that anyone has access to it.
Adding XXX.pdb to /opt/anaconda3/envs/openmm/lib/python3.7/site-packages/pdbfixer/templates. The XXX.pdb file must be in the PDB format that OpenMM expects. The way I did this is by drawing the molecule in e.g. PyMol and converting the geometry output to PDB via OpenBabel.
(Additional) Adding the hydrogen residue template to /opt/anaconda3/envs/openmm/lib/python3.7/site-packages/simtk/openmm/app/data/hydrogens.xml. This template simply states where PDBFixer needs to add hydrogens. I needed this option in order to mutate certain residues in my NCAA. If you don’t need PDBFixer you can skip this step.
Adding the residue template to /opt/anaconda3/envs/openmm/lib/python3.7/site-packages/simtk/openmm/app/data/residues.xml. The residue bond topology is stored here. This template is required in order to run simulations.
Adding ‘XXX’ to the residue lists in the following Python files:
- envs/openmm/lib/python3.7/site-packages/simtk/openmm/app/pdbfile.py
- (Additional) envs/openmm/lib/python3.7/site-packages/pdbfixer/pdbfixer.py
Adding ‘XXX’ to forcefield.xml. I am currently trying to incorporate my NCAA into charmm36.xml, so I don’t have a solid protocol yet. It is best to create a separate ‘force field’ file, say XXX.xml, and load it alongside charmm36.xml in the simulation. This way, charmm36 stays in its original state, which avoids ambiguities.
Make sure that the residue name (here XXX) is consistent throughout all the modifications.
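The consistency requirement can be checked mechanically before running anything. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the definition files are assumed to contain `<Residue name="...">` elements as the bundled hydrogens.xml and residues.xml do, and `load_forcefield` assumes an OpenMM importable as `openmm.app`. It only compares names; it does not validate atom names or parameters.

```python
import xml.etree.ElementTree as ET


def pdb_residue_names(pdb_text):
    """Collect residue names from ATOM/HETATM records (PDB columns 18-20)."""
    names = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            names.add(line[17:20].strip())
    return names


def definition_residue_names(xml_text):
    """Collect the name attribute of every <Residue> element in an XML string."""
    root = ET.fromstring(xml_text)
    return {residue.get("name") for residue in root.iter("Residue")}


def inconsistent_inputs(resname, pdb_template_text, **xml_texts):
    """Return the labels of the inputs that do not mention resname."""
    missing = []
    if resname not in pdb_residue_names(pdb_template_text):
        missing.append("pdb_template")
    for label, xml_text in xml_texts.items():
        if resname not in definition_residue_names(xml_text):
            missing.append(label)
    return missing


def load_forcefield(*xml_files):
    """Load a separate residue file next to the stock force field,
    e.g. load_forcefield('charmm36.xml', 'XXX.xml')."""
    from openmm.app import ForceField  # lazy import; needs OpenMM installed
    return ForceField(*xml_files)
```

An empty list from `inconsistent_inputs` means the name appears in the PDB template and in every XML string passed as a keyword argument.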
As an alternative to modifying the built-in hydrogens.xml and residues.xml files, you can put your definitions in separate files and load them with Modeller.loadHydrogenDefinitions() and Topology.loadBondDefinitions(). Also, the strictly correct approach is not to add any bond definitions, but instead to include CONECT records in the PDB file defining all the bonds. According to the PDB spec, CONECT records are required for nonstandard residues.
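Both alternatives can be sketched in a few lines. This is a minimal sketch under stated assumptions: the atom names in `HYP_BONDS` are illustrative, the XML layout (`<Residues>`, `<Residue name>`, `<Bond from to>`, with a leading `-` referring to the previous residue) is modeled on the bundled residues.xml and should be checked against your installed copy, and all function names are hypothetical. Only `Topology.loadBondDefinitions()` and `Modeller.loadHydrogenDefinitions()` are OpenMM calls.

```python
import xml.etree.ElementTree as ET

# Heavy-atom bonds of hydroxyproline (atom names are assumptions).
HYP_BONDS = [("-C", "N"), ("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB"),
             ("CB", "CG"), ("CG", "CD"), ("CD", "N"), ("CG", "OD1")]


def bond_definitions_xml(resname, bonds):
    """Build a residues.xml-style bond definition for one residue."""
    root = ET.Element("Residues")
    residue = ET.SubElement(root, "Residue", {"name": resname})
    for atom_from, atom_to in bonds:
        ET.SubElement(residue, "Bond", {"from": atom_from, "to": atom_to})
    return ET.tostring(root, encoding="unicode")


def conect_records(serials, bonds):
    """Format PDB CONECT lines; serials maps atom name -> atom serial number.

    Bonds to atoms without a serial (e.g. '-C') are skipped.
    """
    partners = {}
    for a, b in bonds:
        if a in serials and b in serials:
            partners.setdefault(serials[a], set()).add(serials[b])
            partners.setdefault(serials[b], set()).add(serials[a])
    lines = []
    for serial in sorted(partners):
        bonded = sorted(partners[serial])
        for start in range(0, len(bonded), 4):  # at most four partners per line
            fields = [serial] + bonded[start:start + 4]
            lines.append("CONECT" + "".join(f"{n:5d}" for n in fields))
    return lines


def register_definitions(bond_xml_path, hydrogen_xml_path=None):
    """Load separate definition files instead of editing the built-in ones."""
    from openmm.app import Modeller, Topology  # lazy import; needs OpenMM
    Topology.loadBondDefinitions(bond_xml_path)
    if hydrogen_xml_path is not None:
        Modeller.loadHydrogenDefinitions(hydrogen_xml_path)
```

Writing the output of `bond_definitions_xml("HYP", HYP_BONDS)` to a file and passing its path to `register_definitions` keeps the installed package untouched, while `conect_records` covers the CONECT route for a single residue.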
Some years later...
I have some code that I used for PEGylating residues, which may be extended to other non-canonicals. In a nutshell, given the SMILES string for the non-canonical sidechain and the insertion residue, such a script would:
- parameterize the non-canonical using the GAFFTemplateGenerator from openmmforcefields with AM1-BCC charge assignment (which is not the best, I know)
- graft the non-canonical sidechain to the backbone in the topology
- then add forcefield parameters between the existing backbone and the sidechain CB using amber protein ff parameters
This is of course pretty approximate, since the sidechain is all GAFF, the backbone is all ff14SB, and the interface is decided ad hoc to be ff14SB as well. But it's a first step towards such an automation. If you think it's a good idea, I can make a PR to Modeller. But since this involves openmmforcefields as well as the OpenFF Toolkit's Molecule API, what might be a better place to make the PR? Your thoughts, @peastman @jchodera?
I think openmmforcefields is the best place for it. It already depends on both OpenMM and OpenFF Toolkit, while neither of them depends on it.
I recommend using SMIRNOFFTemplateGenerator with the most recent OpenFF force field. It's more accurate than GAFF.
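The recommendation above, as a minimal sketch. The function name `forcefield_with_smirnoff` is hypothetical, and the force field names `amber14/protein.ff14SB.xml` and `openff-2.1.0` are assumptions chosen for illustration rather than a statement of which release is the most recent. It requires OpenMM, openmmforcefields and the OpenFF Toolkit. It covers only the registration step for a standalone molecule given as SMILES; grafting a non-canonical sidechain onto a protein backbone, as discussed in this thread, is not shown.

```python
def forcefield_with_smirnoff(smiles,
                             base_files=("amber14/protein.ff14SB.xml",),
                             smirnoff="openff-2.1.0"):
    """Return an OpenMM ForceField that can parameterize `smiles` on demand."""
    # Lazy imports: these packages are only needed when the function is called.
    from openff.toolkit import Molecule
    from openmm.app import ForceField
    from openmmforcefields.generators import SMIRNOFFTemplateGenerator

    molecule = Molecule.from_smiles(smiles)
    generator = SMIRNOFFTemplateGenerator(molecules=molecule, forcefield=smirnoff)

    forcefield = ForceField(*base_files)
    # The generator is consulted whenever a residue matches no built-in template.
    forcefield.registerTemplateGenerator(generator.generator)
    return forcefield
```

The returned object is then used like any other `ForceField`, for example with `createSystem()` on a topology that contains the molecule.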
Dear Tanmoy @tanmoy7989 did you end up submitting a PR to openmmforcefields to build non-canonical protein residue templates? We are modeling a receptor with a covalent inhibitor, and we can't find a good OpenFF-based semiautomatic tool to parameterize it. Your effort appears to be the closest to a solution. Thank you in advance!
Any help would be really appreciated here.
I am testing a likely useful prototype. I will share the script with anyone who commits to help add the feature to openmmforcefields. ;-)
Hi @egallicc, terribly sorry that I missed this earlier. So, I did not submit a PR, since my strategy was not general enough. In my scenario, I had to acylate a residue with a fatty acid, and the only logical workflow I could come up with was:
- combine the side chain of this amino acid with the fatty acid (or whatever other PTM you want here) and write out the combined SMILES string for this modified sidechain.
- treat this modified sidechain as a small molecule and use the GAFFTemplateGenerator to generate gaff parameters for this modified residue sidechain.
- for gaff compatibility, use amber forcefield for the backbone of this amino acid (as well as for the rest of the protein).
- since heavy atoms and hydrogens at the point where the sidechain begins are now gaff atom-types, manually fill in amber ff parameters for the bonds, angles and torsions that span the backbone-sidechain junction of this residue.
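The last step needs a list of the terms that cross the junction. A minimal sketch of how that list could be enumerated from the bond graph, assuming a hypothetical function `interface_terms` and illustrative atom names; it only finds which bonds, angles and proper torsions mix backbone and sidechain atoms, and does not assign any parameters or handle improper torsions.

```python
from itertools import combinations


def interface_terms(bonds, backbone):
    """List bonds, angles and proper torsions that contain both backbone and
    sidechain atoms. `bonds` is a list of (atom, atom) name pairs for one
    residue; `backbone` is the set of backbone atom names."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    def crosses(atoms):
        # True only if the term mixes backbone and non-backbone atoms.
        return {atom in backbone for atom in atoms} == {True, False}

    bond_terms = sorted(tuple(sorted(pair)) for pair in bonds if crosses(pair))

    angle_terms = set()
    for center, bonded in neighbors.items():
        for a, c in combinations(sorted(bonded), 2):
            if crosses((a, center, c)):
                angle_terms.add((a, center, c))

    torsion_terms = set()
    for b, c in bonds:  # each bond is a candidate central bond
        for a in neighbors[b] - {c}:
            for d in neighbors[c] - {b}:
                if a != d and crosses((a, b, c, d)):
                    term = (a, b, c, d)
                    torsion_terms.add(min(term, term[::-1]))  # one direction only
    return bond_terms, sorted(angle_terms), sorted(torsion_terms)


# Toy residue: backbone N, CA, C, O with a two-atom sidechain CB-CG.
example_bonds = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB"), ("CB", "CG")]
example = interface_terms(example_bonds, backbone={"N", "CA", "C", "O"})
```

For a real residue the bond list would also include hydrogens and the bonds to neighbouring residues, which add further junction terms.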
All of the above can be automated, but it would be very specific to the amber-gaff combo. Also, the GAFFTemplateGenerator uses the AM1-BCC approximation for charge assignment, and depending on how much accuracy you are shooting for, that may or may not be acceptable for you.
Hope this helps. If you have a strategy, please let me know. I'd be happy to help in any way to add it to openmmforcefields.
Yes, my approach is similar but with OpenFF. I expect it would work with GAFF as well, but I am not sure.
Right, I think that, short of a full bespoke parameterization, any effort in this direction will be a bit of a hack.
I have no reason to believe that my approach is any better than yours. Shall we compare notes and see which one is the best candidate for an openmmforcefields PR? Crafting it as a feature of the TemplateGenerator is probably the best way to make it usable to the community.