biopython PDBIO.save() include SEQRES information?

Setup

I have a question regarding Biopython, with the following Biopython version, Python version, and operating system:

Python 3.9.12 (main, Jun  1 2022, 11:38:51) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; print(sys.version)
3.9.12 (main, Jun  1 2022, 11:38:51) 
[GCC 7.5.0]
>>> import platform; print(platform.python_implementation()); print(platform.platform())
CPython
Linux-5.15.0-46-generic-x86_64-with-glibc2.35
>>> import Bio; print(Bio.__version__)
1.79

Question

Hi Biopython developers,

I was wondering whether there is any way to make the PDBIO.save() method include the SEQRES information in the PDB file it writes out. My downstream processes require this information, but writing the PDB file with PDBIO currently does not preserve the SEQRES records. Is this feature available?

Thank you.

Aug 30 '22 06:08 JSLJ23

The PDB header parsing was a later addition to the core of the 3D data loading, and as far as I know there is no attempt to write a header. Would you be hoping for an automatically generated SEQRES based on the peptides (not sure if that is possible), or a recreation of the original header if you started by parsing a PDB file? In the latter case, you would probably have to modify the header entries to be consistent with any changes you have made to the structure.

I am not the best person to comment on the feasibility of either of the above ideas.

Aug 30 '22 08:08 peterjc

Actually, my intended purpose is to pipe the written PDB file into PDBFixer to build missing residues and relax the structure for downstream molecular modelling (docking and MD). But PDBFixer needs the SEQRES section to be present to correctly impute the missing residues; otherwise it will just crudely join the unmodelled regions of the PDB.

If there's a separate module in Biopython that can parse the SEQRES portion and return a list of the three-letter amino acid codes, that works for me too, because I can supply that list to PDBFixer so it can align the structure, deduce the missing residues, and build them.

May I know if there’s a module within Biopython that can specifically parse the SEQRES component of the PDB file?

Aug 31 '22 02:08 JSLJ23

Answering from my phone, but yes: see the PDB support in the SeqIO module for an example of parsing the SEQRES lines.
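For readers following along: Bio.SeqIO has a "pdb-seqres" format that returns the SEQRES sequence of each chain as a SeqRecord, in one-letter code. If the three-letter names are needed directly, the fixed-column SEQRES records can also be read by hand. The following is only a minimal sketch of that second route; the function name `read_seqres` and the example lines are made up for illustration and are not part of Biopython.

```python
from collections import defaultdict


def read_seqres(lines):
    """Return {chain_id: [three-letter residue names]} from PDB SEQRES records.

    Hypothetical helper, not a Biopython function. It relies on the
    fixed-column PDB layout: chain ID in column 12, residue names in
    columns 20-70.
    """
    chains = defaultdict(list)
    for line in lines:
        if not line.startswith("SEQRES"):
            continue
        chain_id = line[11:12]  # column 12 (may be a blank)
        chains[chain_id].extend(line[19:70].split())  # columns 20-70
    return dict(chains)


# Illustrative records only; in practice pass an open PDB file handle.
example = [
    "SEQRES   1 A   15  MET LYS THR ALA TYR ILE ALA LYS GLN ARG GLN ILE SER",
    "SEQRES   2 A   15  PHE VAL",
    "SEQRES   1 B    3  GLY SER HIS",
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
]

for chain_id, names in read_seqres(example).items():
    print(chain_id, len(names), names[:3])
```

With a real file, something like `with open("input.pdb") as handle: read_seqres(handle)` should give one list per chain (the filename is a placeholder).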

Aug 31 '22 06:08 peterjc

This would be trivial to add, if the purpose is to write the sequence of the residues for which there are (at least one) coordinates.
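To make the first case concrete, here is a rough sketch of formatting SEQRES records from a list of residue names. It is not an existing Biopython feature; `format_seqres` is a hypothetical helper, and the column layout (13 residues per record, names starting in column 20) follows the PDB format as I understand it.

```python
def format_seqres(chain_id, resnames, per_line=13):
    """Format three-letter residue names as PDB SEQRES records.

    Hypothetical helper, not part of Biopython. Each record carries the
    record serial number, the chain ID, the total residue count for the
    chain, and up to 13 residue names.
    """
    total = len(resnames)
    lines = []
    for serial, start in enumerate(range(0, total, per_line), start=1):
        chunk = resnames[start:start + per_line]
        names = " ".join(f"{name:>3}" for name in chunk)
        lines.append(f"SEQRES {serial:>3} {chain_id:1} {total:>4}  {names}")
    return lines


# With Biopython, the names could come from a parsed chain, e.g.
#   resnames = [residue.get_resname() for residue in chain]
# Here a plain list is used so the sketch runs on its own.
for record in format_seqres("A", ["MET", "LYS", "THR", "ALA"]):
    print(record)
```

The records could then be written to an open file handle before passing that same handle to PDBIO.save(), assuming save() is given a handle rather than a filename. Note that a SEQRES built this way lists only residues that have coordinates (and any waters or ligands in the chain unless they are filtered out), so it would not tell PDBFixer about missing residues.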

If the intention is to read in the original seqres and then write it back, it's a trickier situation because of modifications you might make to the structure (specifically thinking about insertions here).

What exactly is your goal? Use PDBFixer to add missing sidechain atoms or entire missing residues? If the latter, I'd recommend against using PDBFixer and instead using modeller, because as you said, PDBFixer does a very very crude modeling job.

Aug 31 '22 07:08 JoaoRodrigues
