pdb2pqr
How to handle charge and radii in mmCIF
We can use CIF files as input to PDB2PQR but how do we handle the atom charge and radii?
Using the mmcif_pdbx package, we can load PDB (atom_site) data from a CIF file using something like the following:
# Example of how to get the atom_site category from an mmCIF file
from pathlib import Path

import pytest
from pdbx.reader import PdbxReader

# Assumption: the test CIF files live in a "data" directory next to this module.
DATA_DIR = Path(__file__).parent / "data"


@pytest.mark.parametrize("input_cif", ["1kip.cif", "1ffk.cif"], ids=str)
def test_data_file(input_cif):
    """Test data file input."""
    input_path = DATA_DIR / input_cif
    with open(input_path, "rt") as input_file:
        reader = PdbxReader(input_file)
        data_list = []
        reader.read(data_list)
    for item in data_list:
        # print_it() writes the category to stdout itself, so no print() wrapper.
        item.get_object("atom_site").print_it()
There are other mmCIF categories that have radius and charge.
For example, the mmCIF dictionary (https://www.iucr.org/__data/iucr/cifdic_html/2/cif_mm.dic/index.html) defines _chem_comp_atom.charge (integer) and _chem_comp_atom.partial_charge (float).
The question is how to tie the _atom_site rows to those other categories, for example by matching _chem_comp_atom.atom_id to _atom_site.label_atom_id.
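A minimal sketch of that join, assuming the rows have already been parsed into plain dictionaries (the function name `attach_charges` and the charge values are illustrative, not taken from PDB2PQR or mmcif_pdbx). Because an atom name such as N or CA repeats across residue types, the lookup key also includes the residue name (_chem_comp_atom.comp_id matched to _atom_site.label_comp_id):

```python
# Sketch only: plain dicts stand in for rows parsed from an mmCIF file.
# Assumption: the file carries a _chem_comp_atom category with
# partial_charge filled in. The charge values below are placeholders.

atom_site = [
    {"id": "1", "label_comp_id": "GLY", "label_atom_id": "N"},
    {"id": "2", "label_comp_id": "GLY", "label_atom_id": "CA"},
    {"id": "3", "label_comp_id": "HIS", "label_atom_id": "ND1"},
]

chem_comp_atom = [
    {"comp_id": "GLY", "atom_id": "N", "partial_charge": "-0.4"},
    {"comp_id": "GLY", "atom_id": "CA", "partial_charge": "0.1"},
]


def attach_charges(atom_rows, comp_atom_rows):
    """Return copies of atom_rows with a 'charge' key looked up by name."""
    lookup = {
        (row["comp_id"], row["atom_id"]): float(row["partial_charge"])
        for row in comp_atom_rows
    }
    joined = []
    for atom in atom_rows:
        key = (atom["label_comp_id"], atom["label_atom_id"])
        # None marks atoms that have no matching _chem_comp_atom entry.
        joined.append({**atom, "charge": lookup.get(key)})
    return joined


if __name__ == "__main__":
    for atom in attach_charges(atom_site, chem_comp_atom):
        print(atom["id"], atom["label_comp_id"], atom["label_atom_id"], atom["charge"])
```

Note that a name-based lookup like this gives every atom with the same residue and atom name the same charge.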
@speleo3 and @orbeckst -- do you see any use cases where PQR-like information would be useful in mmCIF format? If not, we'll probably treat this as low priority. Thanks!
I'd be all for deprecating PQR and only using something mmCIF based instead. The use case would be that we could abandon PQR parsers :-)
That was my original request in https://github.com/Electrostatics/pdb2pqr/issues/34
Such a file could be a 100% valid mmCIF file with added radius and charge columns. I'm not sure though if _chem_comp_atom properties are a good fit, that would require for example different residue names for two HIS with different charge configuration. It would be much easier to add two custom columns to the _atom_site table, and/or propose adding such columns to one of the official dictionaries.
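To make the two-custom-columns idea concrete, here is a rough sketch of what writing such a table could look like. The column names `_atom_site.pqr_charge` and `_atom_site.pqr_radius` are hypothetical (no names had been agreed on), the numbers are placeholders, and the loop is abbreviated: a real _atom_site table has many more columns (coordinates, chain IDs, and so on), and values containing quotes or spaces would need CIF quoting.

```python
# Hypothetical column names, used here for illustration only.
CHARGE_COLUMN = "_atom_site.pqr_charge"
RADIUS_COLUMN = "_atom_site.pqr_radius"

# Abbreviated subset of the standard _atom_site columns.
STANDARD_COLUMNS = [
    "_atom_site.group_PDB",
    "_atom_site.id",
    "_atom_site.label_atom_id",
    "_atom_site.label_comp_id",
    "_atom_site.label_seq_id",
]


def format_atom_site_loop(atoms):
    """Format atoms as an mmCIF loop with per-atom charge and radius columns."""
    lines = ["loop_"]
    lines.extend(STANDARD_COLUMNS + [CHARGE_COLUMN, RADIUS_COLUMN])
    for atom in atoms:
        lines.append(
            f"ATOM {atom['id']} {atom['atom']} {atom['comp']} {atom['seq']} "
            f"{atom['charge']:.4f} {atom['radius']:.4f}"
        )
    return "\n".join(lines) + "\n"


# Two HIS residues with different charges on the same atom name
# (placeholder values). Both keep the residue name HIS.
atoms = [
    {"id": 1, "atom": "NE2", "comp": "HIS", "seq": 10, "charge": -0.57, "radius": 1.82},
    {"id": 2, "atom": "NE2", "comp": "HIS", "seq": 42, "charge": -0.17, "radius": 1.82},
]

if __name__ == "__main__":
    print(format_atom_site_loop(atoms), end="")
```

Because charge and radius are stored per row, the two histidines can differ without needing distinct residue names.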
Oops -- sorry about that! I re-opened the original issue.
(Sent by email in reply to Thomas Holder's comment of Monday, January 25, 2021, 11:05 PM, on "How to handle charge and radii in mmCIF" (#175).)
@speleo3 I think we could add the custom fields to the _atom_site table, but I don't know whether that would create non-standard mmCIF files that other mmCIF parsers, like https://github.com/biopython/biopython/blob/master/Bio/PDB/MMCIFParser.py, could not parse.
That is why I was wondering if there is another section that could hold the charge and radius, one that would be accessible to the mmcif_pdbx parser but would not break other parsers.
My concern is that a user would run APBS or PDB2PQR and end up with an mmCIF output file that is incompatible with the other mmCIF parsers in their processing pipeline.
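One way to reason about this concern: a reader that looks up values by column name can skip columns it does not recognize, while a reader that assumes a fixed column count or order could fail. The toy reader below illustrates only the first case. It is a standard-library sketch with hypothetical column names and placeholder values, it handles only simple whitespace-separated loops, and it says nothing about how Biopython or any other real parser behaves. That would have to be tested against each parser.

```python
def read_loop(text):
    """Parse one simple mmCIF loop_ block into dicts keyed by column name.

    Simplification: values are whitespace-separated, with no quoting and
    no multi-line strings.
    """
    names, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_" or line.startswith("#"):
            continue
        if line.startswith("_"):
            names.append(line)
        else:
            rows.append(dict(zip(names, line.split())))
    return rows


def known_fields(rows, wanted=("_atom_site.id", "_atom_site.label_atom_id")):
    """Keep only the columns this reader knows about; extras are ignored."""
    return [{name: row[name] for name in wanted} for row in rows]


# Hypothetical pqr_charge / pqr_radius columns with placeholder values.
SAMPLE = """\
loop_
_atom_site.id
_atom_site.label_atom_id
_atom_site.pqr_charge
_atom_site.pqr_radius
1 N -0.4000 1.8240
2 CA 0.0300 1.9080
"""

if __name__ == "__main__":
    for row in known_fields(read_loop(SAMPLE)):
        print(row)
```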
What's the status of the CIF output file?
I am working on it as quickly as I can. Would you like to help?
Yes. I can probably start dedicating some serious time mid next week.
Can y'all catch me up over the next few days on the status, implementation design, and what needs to be done?
Sure! I have some initial code that I'll post in a few days. The PDB -> CIF translation works well but I was holding off releasing it to get the CIF -> PDB part done. I'll remedy that shortly.
Thanks!
(Sent by email in reply to Danny Diaz's message of Tue, Mar 16, 2021, 9:28 PM.)
Awesome. Let me know when you post the code for me to begin familiarizing myself.
I'll let you know when I finish what I am working on and can transition over to this, sometime in the next week or so.
For clarification, this is "few days" in COVID time: I'm still working on the code. I wrote most of it and then found a better way to do it so...
I am preparing slides/code for a talk this Friday.
I am also implementing CIF file writing in our other library dependency (freesasa).
So quite honestly, sometime next week will probably be more realistic on my end.
Writing a PQR CIF file is the last loose end in our tech stack so I definitely want to hammer this out in the near future.
Glad we are openly communicating our timelines.
Ready to start contributing. I'm guessing it's the nathan/cif branch?
@speleo3 @sobolevnrm did we ever decide on the two custom field names in the _atom_site table for the charge and radii?
No, but we should probably address this in https://github.com/Electrostatics/pdb2cif.
@danny305 -- I was going to redirect you over there as well for this thread.
Why don't we just use Gemmi to convert between the two?
Let's move this discussion to the other repo. Can you provide a description of what Gemmi does over there? Thanks.