
Hydrogen addition problems with non-experimental input structures

Open dargen3 opened this issue 3 years ago • 19 comments

Hello, I would like to report a strange behavior of the software tool pdb2pqr30.

I am using pdb2pqr30 to protonate a structure with the following command:

pdb2pqr30 --log-level DEBUG --with-ph 7.2 AF-P0DSE4-F1-model_v2.pdb AF-P0DSE4-F1-model_v2_protonated.pqr

The pdb2pqr30 version is 3.4.1 and the OS is Ubuntu 21.10.

However, the structure appears to be protonated incorrectly. [image]

Please let me know if I am doing something wrong. If this is a software error, can it be fixed? Alternatively, is there an estimate of when it might be fixed? The PDB file can be downloaded from https://alphafold.ebi.ac.uk/entry/P0DSE4

Thank you. Regards, Schindler

dargen3 avatar Feb 18 '22 19:02 dargen3

Can you please tell us what specific aspect of protonation state is incorrect?

sobolevnrm avatar Feb 18 '22 20:02 sobolevnrm

Is it the 3 bonds on the lower left atom that is red or the atoms that are not connected in the lower middle of the image?

intendo avatar Feb 18 '22 20:02 intendo

Wow, that was a really fast reaction! :) It should be a standard serine, so the carbon (grey) should have 2 hydrogens instead of 3. Moreover, the distance between the carbon and a hydrogen is too small (0.65 Å).
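A distance like the one reported above can be checked directly from the PQR file rather than from a viewer. The following is a minimal sketch, assuming the PQR file is whitespace-delimited with the last five fields being x, y, z, charge and radius, and that hydrogens can be recognised by an atom name starting with "H" (a heuristic). The function names and the 0.9 Å cutoff are illustrative choices, not part of PDB2PQR; a typical C-H bond length is about 1.09 Å.

```python
import math


def parse_pqr_atoms(text):
    """Parse ATOM/HETATM records from whitespace-delimited PQR text."""
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10 or fields[0] not in ("ATOM", "HETATM"):
            continue
        # The chain ID is optional in PQR files, so index from the end:
        # the last five fields are x, y, z, charge, radius.
        x, y, z = (float(v) for v in fields[-5:-2])
        atoms.append({
            "name": fields[2],
            "resname": fields[3],
            "chain": fields[4] if len(fields) >= 11 else "",
            "resnum": fields[-6],
            "xyz": (x, y, z),
        })
    return atoms


def is_hydrogen(name):
    """Heuristic: hydrogen names start with 'H', optionally after digits."""
    return name.lstrip("0123456789").startswith("H")


def short_h_contacts(atoms, cutoff=0.9):
    """Return (hydrogen, heavy atom, distance) triples below cutoff (in Å),
    comparing only atoms that belong to the same residue."""
    residues = {}
    for atom in atoms:
        key = (atom["chain"], atom["resnum"], atom["resname"])
        residues.setdefault(key, []).append(atom)
    hits = []
    for group in residues.values():
        hydrogens = [a for a in group if is_hydrogen(a["name"])]
        heavy = [a for a in group if not is_hydrogen(a["name"])]
        for h in hydrogens:
            for other in heavy:
                d = math.dist(h["xyz"], other["xyz"])
                if d < cutoff:
                    hits.append((h["name"], other["name"], d))
    return hits
```

Reading a file and calling `short_h_contacts(parse_pqr_atoms(text))` lists any hydrogen that sits closer to a heavy atom of its own residue than the cutoff. Files whose coordinate columns run together without whitespace would need a fixed-width parser instead.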

Screenshot from 2022-02-18 21-18-40

dargen3 avatar Feb 18 '22 20:02 dargen3

This seems more likely to be a visualization issue than an issue with PDB2PQR. The PQR files that PDB2PQR produces contain no information about bonding. However, the visualization above is drawing odd bonds between atoms. This sometimes happens in programs such as VMD when the bonding information is inferred from the radii in the PQR file (rather than the built-in radii used for PDB files).

When I visualize the results of the calculation above with PyMOL, I get the following images, which do not show the bonding problems reported above.

issue_304 issue_304a

sobolevnrm avatar Feb 21 '22 03:02 sobolevnrm

OK, the missing bonds are probably an Avogadro error. But the unresolved problem is that the serine carbon has 3 hydrogens instead of 2. You can see it in your PyMOL picture too. [image]

dargen3 avatar Feb 24 '22 17:02 dargen3

Any progress here, please?

dargen3 avatar Mar 08 '22 08:03 dargen3

No, I haven't had time to work on this. Sorry.

sobolevnrm avatar Mar 08 '22 12:03 sobolevnrm

Do you have any idea whether you will have time to do that? We are planning to use pdb2pqr for a large AlphaFold2-related project, and I don't know whether to wait for the bug fix or look for another tool.

dargen3 avatar Apr 01 '22 12:04 dargen3

I will try to work on it this weekend. Sorry.

sobolevnrm avatar Apr 01 '22 12:04 sobolevnrm

> But the unresolved problem is that the serine carbon has 3 hydrogens instead of 2. You can see it in your PyMOL picture too.

This is not an extra atom: it is the alpha-carbon hydrogen rotated the wrong way. There's a problem with the input structure that is affecting hydrogen optimization; see, for example, the messages generated by PDB2PQR:

2022-04-02 08:03:53,582 DEBUG:debump.py:200:find_residue_conflicts:SER A 12 HA is too close to SER A 12 CB
2022-04-02 08:03:53,583 DEBUG:debump.py:161:debump_biomolecule:Starting to debump SER A 12...
2022-04-02 08:03:53,584 DEBUG:debump.py:162:debump_biomolecule:Debumping cutoffs: 2.0 for heavy-heavy, 1.5 for hydrogen-heavy, and 1.0 for hydrogen-hydrogen.
2022-04-02 08:03:53,584 WARNING:debump.py:172:debump_biomolecule:WARNING: Unable to debump SER A 12

I've never seen a debumping issue like this before with experimentally derived structures, which is why I suspect it is a problem with the input file. This will take a while to debug.
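For anyone screening many predicted structures, the debump messages quoted above can be pulled out of a saved DEBUG log automatically. This is only a sketch: the regular expressions assume the exact message wording shown in this thread ("is too close to" and "Unable to debump"), which may differ in other PDB2PQR versions, and `summarize_debump_log` is a hypothetical helper name.

```python
import re

# Patterns based only on the log lines quoted in this thread (assumption:
# residues are written as "<name> <chain> <number>").
TOO_CLOSE = re.compile(
    r"find_residue_conflicts:(\w+) (\w+) (-?\d+) (\S+) "
    r"is too close to (\w+) (\w+) (-?\d+) (\S+)"
)
UNABLE = re.compile(r"Unable to debump (\w+) (\w+) (-?\d+)")


def summarize_debump_log(log_text):
    """Collect reported atom conflicts and residues that could not be debumped."""
    conflicts = []
    for m in TOO_CLOSE.finditer(log_text):
        first = (m.group(1), m.group(2), int(m.group(3)), m.group(4))
        second = (m.group(5), m.group(6), int(m.group(7)), m.group(8))
        conflicts.append((first, second))
    failed = [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in UNABLE.finditer(log_text)
    ]
    return {"conflicts": conflicts, "failed": failed}
```

Applied to the four lines above, this reports one conflict (HA against CB in SER A 12) and one residue that could not be debumped. Residues in the "failed" list are the ones worth inspecting in the input structure.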

sobolevnrm avatar Apr 02 '22 15:04 sobolevnrm

> This is not an extra atom: it is the alpha-carbon hydrogen rotated the wrong way. There's a problem with the input structure that is affecting hydrogen optimization; see, for example, the messages generated by PDB2PQR:

OK, that was my mistake. You are right. Thank you for the error messages.

> I've never seen a debumping issue like this before with experimentally derived structures, which is why I suspect it is a problem with the input file. This will take a while to debug.

Do you want more problematic structures for debugging? Please let me know if you find a problem in the structure so that I can inform the AlphaFold developers.

dargen3 avatar Apr 04 '22 12:04 dargen3

If you can share some additional structures here, that would be helpful. Thank you.

sobolevnrm avatar Aug 06 '22 21:08 sobolevnrm

Just for documentation purposes, I have encountered this bug with other neural-network predictors similar to AlphaFold as well (ABodyBuilder2, specifically the nanobody predictor). Sadly, I cannot share the generated structures.

HankewieDanke avatar Feb 06 '23 14:02 HankewieDanke

I am sending the UniProt codes of 10 more problematic structures from AlphaFold DB.

UniProt, pH, problematic atom index
P56641, 12.3, 13
A0A1Z1CH22, 10.9, 23
Q3SAF8, 10.4, 405
J3QJY3, 13.7, 467
B3H610, 8.8, 354
Q38F30, 11.1, 391
F6YG85, 3.0, 82
P0DSE4, 11.4, 190
J3QJY3, 3.9, 475
B3H610, 3.3, 357

All structures can be downloaded from https://alphafold.ebi.ac.uk/files/AF-{UniProt}-F1-model_v4.pdb

All structures were protonated with the command:

pdb2pqr30 --log-level DEBUG --noopt --titration-state-method propka --with-ph <ph> --pdb-output <pdb_output> <pdb_input> <pdb_output>
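To make the cases above easier to reproduce, the download URL and the command line for each entry can be generated from the table. This is a rough sketch rather than a tested pipeline: the output file names are made up for illustration, and it assumes the final positional argument of the command is the PQR output path. The flags are the ones quoted in the command above. Nothing is downloaded or executed here.

```python
# (UniProt code, pH, problematic atom index), copied from the table above.
CASES = [
    ("P56641", 12.3, 13),
    ("A0A1Z1CH22", 10.9, 23),
    ("Q3SAF8", 10.4, 405),
    ("J3QJY3", 13.7, 467),
    ("B3H610", 8.8, 354),
    ("Q38F30", 11.1, 391),
    ("F6YG85", 3.0, 82),
    ("P0DSE4", 11.4, 190),
    ("J3QJY3", 3.9, 475),
    ("B3H610", 3.3, 357),
]

URL_TEMPLATE = "https://alphafold.ebi.ac.uk/files/AF-{uniprot}-F1-model_v4.pdb"


def build_job(uniprot, ph):
    """Return the download URL and the pdb2pqr30 argument list for one case."""
    pdb_input = f"AF-{uniprot}-F1-model_v4.pdb"
    # Hypothetical output naming: the pH is included because some UniProt
    # codes appear twice in the table with different pH values.
    stem = f"AF-{uniprot}-F1-model_v4_pH{ph}"
    command = [
        "pdb2pqr30",
        "--log-level", "DEBUG",
        "--noopt",
        "--titration-state-method", "propka",
        "--with-ph", str(ph),
        "--pdb-output", f"{stem}.pdb",
        pdb_input,
        f"{stem}.pqr",  # assumed to be the PQR output path
    ]
    return {"url": URL_TEMPLATE.format(uniprot=uniprot), "command": command}


JOBS = [build_job(uniprot, ph) for uniprot, ph, _atom_index in CASES]
```

After downloading each `url` to the matching input file name, a job could be run with `subprocess.run(job["command"], check=True)`.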

dargen3 avatar Feb 17 '23 12:02 dargen3

> OK, the missing bonds are probably an Avogadro error. But the unresolved problem is that the serine carbon has 3 hydrogens instead of 2. You can see it in your PyMOL picture too.

Can you please share the PQR files for this or the other structures that are having problems?

sobolevnrm avatar Feb 19 '23 17:02 sobolevnrm

structures.zip

The zip file contains the 10 PQR files mentioned above. If you need more, let me know.

dargen3 avatar May 18 '23 12:05 dargen3

Hello,

Can I expect some progress, please? Should I send more structures with errors? I plan to use pdb2pqr in a publication on predicted structures that I will write during the fall.

dargen3 avatar Oct 09 '23 14:10 dargen3

This code is only supported by volunteer effort. Progress is based on the time those volunteers have available.

sobolevnrm avatar Oct 09 '23 15:10 sobolevnrm

Hello! I have also been using pdb2pqr30 (v3.6.1) to protonate proteins. Having found this issue thread, I see that I am not the only one encountering this behaviour.

Here is the command I used:

pdb2pqr30 --noopt --nodebump --pdb-output <pdb-output> <output> <input> --titration-state-method propka --with-ph 7.2

As you can see, the pH is set to 7.2 for all proteins.

Here are a few PQR files as examples, each with the problematic atom number at the end of its name: proteins.zip

If you need more PQR files, I am happy to provide them.

I would greatly appreciate your effort to address this malfunction, as I would really like to use your tool specifically for a paper that should be finished shortly.

kekasz avatar Oct 15 '23 15:10 kekasz