DeepAb
error while trying to predict the structure
I have an antibody sequence with a GGGGSGGGGSGGGGS linker. I divided the sequence into the heavy and light chains and tried to predict the structure, but the "Annotate predicted structure with output attention" step produced this error:
PDB must have a chain with chain id "[PBD ID]:H"
/usr/local/lib/python3.7/dist-packages/Bio/PDB/PDBParser.py:399: PDBConstructionWarning: Ignoring unrecognized record 'pdbpat' at line 1
PDBConstructionWarning,
An exception has occurred, use %tb to see the full traceback.
SystemExit: -1
Also, the generated .deepab.pdb file does not contain any atoms (it has "pdbpatchnumbering: Unable to read patch file").
I am using the Colab version.
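For anyone who hits the same symptom, one quick way to confirm that an output file has no coordinates is to count the ATOM records per chain. This is a minimal sketch in plain Python and is not part of DeepAb; the function name `count_atoms_per_chain` is made up for illustration. It relies only on the fixed-column PDB format, where the chain ID sits in column 22.

```python
from collections import Counter


def count_atoms_per_chain(pdb_text: str) -> Counter:
    """Count ATOM/HETATM records per chain ID in PDB-format text."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            # Fixed-column PDB format: chain ID is column 22 (index 21).
            chain_id = line[21] if len(line) > 21 else " "
            counts[chain_id] += 1
    return counts


# A file that holds only the message reported above has no atom records,
# so the resulting counter is empty.
broken_output = "pdbpatchnumbering: Unable to read patch file\n"
print(count_atoms_per_chain(broken_output))
```

To run it on a real file, read the file's text first and pass it in. An empty result means no chain H or L is present, which is consistent with the chain id error above.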
This worked fine when I removed the renumber option. However, there is a new bug: when I enter the heavy and light chain sequences with the renumber option not selected, it produces a file with two chains very far from each other.
Can you share an example of a fasta file that produces these errors? I can try to reproduce the errors and update the code accordingly.
EVKLQESEREQTTNGAWLNVKAYPHISSSKATLPKPSRTAPPTGDQMSKHLSLVFSSTYCDIQGFSQSVFRPGPICVVCVQSWSYLCTVCPILVLSVYVFSPGPICVHVCPVLVPSVYPLVPGGGGSGGGGSGGGGSDIVLTQSPWMVDVSNKFTVKYKTRGHYDPETLSQPEPAALVTLNCPFPGAHSSQPRTPVFASLSGLKSKPHLFLDSAHQPSANIPVTVPLLMHILPGSWDAAPTV
You may want to format your FASTA file as in the example, with the H and L chains indicated separately.
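As a rough illustration of that suggestion, the sketch below splits a single-chain sequence at the linker and writes the two halves as separate FASTA records. It rests on several assumptions: the heavy chain comes before the linker (as it appears to in the sequence above), the linker occurs exactly once, and the headers follow a `>name:H` / `>name:L` pattern, which is inferred from the `"[PBD ID]:H"` chain id in the error message rather than confirmed against the example file. The names `split_scfv`, `to_fasta`, and `scfv` are hypothetical.

```python
from typing import Tuple

LINKER = "GGGGSGGGGSGGGGS"  # linker reported in this thread


def split_scfv(sequence: str, linker: str = LINKER) -> Tuple[str, str]:
    """Split a single-chain sequence into (heavy, light) at the linker.

    Assumes heavy-linker-light order and exactly one linker occurrence.
    """
    seq = "".join(sequence.split()).upper()  # drop whitespace and newlines
    if seq.count(linker) != 1:
        raise ValueError("expected exactly one linker occurrence")
    heavy, light = seq.split(linker)
    return heavy, light


def to_fasta(heavy: str, light: str, name: str = "scfv") -> str:
    """Format the two chains as separate records (header style is assumed)."""
    return f">{name}:H\n{heavy}\n>{name}:L\n{light}\n"


# Toy input: short fragments taken from either side of the linker above.
toy = "EVKLQES" + LINKER + "DIVLTQSP"
heavy, light = split_scfv(toy)
print(to_fasta(heavy, light))
```

Raising an error when the linker is missing or repeated is deliberate: a silent split at the wrong position would produce chains that look valid but are not.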
Hi @hima111997, just wanted to give a quick update.
I ran your sequences in the Colab notebook and am getting strange results as well. When I simply remove the linker, it looks like the chains are coming out almost perfectly linear. I tried a few strategies, including truncating some of the ends near the linker and got the same results.
I also tried to make these predictions with our more recent IgFold model, and I got another failure. However, this time I got an overlay of both chains as the output. I also tried to predict the chains individually with IgFold and got poor predictions for both. I'll dig in more and see if I can find what's confusing the model about these sequences.