ColabFold fails to predict protein complex (invalid literal for int() with base 10: 'DUMMY' at MSA step)
Dear ColabFold Team,
I am trying to model the human Get1-Get3 complex using full-length Get3 and a fragment of Get1 (348 and 59 residues, respectively, as given in the EMBL tutorial: https://www.ebi.ac.uk/training/online/courses/alphafold/accessing-and-predicting-protein-structures-with-alphafold/predicting-protein-structures-with-colabfold-and-alphafold-colab/). The only setting I changed from the defaults was selecting 'alphafold2_multimer_v3' as the model; AMBER relaxation and templates were left off, which is standard. The run fails at the MSA step (on ColabFold v1.5.5: AlphaFold2 using MMseqs2). This is the exact sequence I used:
MAAGVAGWGVEAEEFEDAPDVEPLEPTLSNIIEQRSLKWIFVGGKGGVGKTTCSCSLAVQLSKGRESVLIISTDPAHNISDAFDQKFSKVPTKVKGYDNLFAMEIDPSLGVAELPDEFFEEDNMLSMGKKMMQEAMSAFPGIDEAMSYAEVMRLVKGMNFSVVVFDTAPTGHTLRLLNFPTIVERGLGRLMQIKNQISPFISQMCNMLGLGDMNADQLASKLEETLPVIRSVSEQFKDPEQTTFICVCIAEFLSLYETERLIQELAKCKIDTHNIIVNQLVFPDPEKPCKMCEARHKIQAKYLDQMEDLYEDFHIVKLPLLPHEVRGADKVNTFSALLLEPYKPPSAQ::::::::::::::::::LQKDAEQESQMRAEIQDMKQELSTVNMMDEFARYARLERKINKMTDKLKTHVKARTAQL
Below is the error message and traceback:
Could not get MSA/templates for GET1GET3complex_42bc5: invalid literal for int() with base 10: 'DUMMY'
Traceback (most recent call last):
  File "/content/colabfold/batch.py", line 1280, in run
    = get_msa_and_templates(jobname, query_sequence, a3m_lines, result_dir, msa_mode, use_templates,
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/colabfold/batch.py", line 669, in get_msa_and_templates
    a3m_lines = run_mmseqs2(
                ^^^^^^^^^^^^
  File "/content/colabfold/colabfold.py", line 311, in run_mmseqs2
    M = int(line[1:].rstrip())
        ^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'DUMMY'
Can you help me understand what is causing this error, whether it is a user or notebook issue, and whether there is anything I can do to fix my query? If it is a bug, please look into it.
Thanks for your help!
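The last frame of the traceback can be reproduced in isolation. This is a minimal sketch, assuming the line being parsed was a header of the form ">DUMMY"; the value of `line` is illustrative, and only the expression `int(line[1:].rstrip())` is taken from the traceback. It shows that the parser expects a numeric ID after the first character and raises as soon as it gets text instead.

```python
# Hypothetical input: a header line whose ID is text rather than a number.
line = ">DUMMY\n"

try:
    # Same expression as the failing line shown in the traceback.
    M = int(line[1:].rstrip())
except ValueError as err:
    message = str(err)
    print(message)  # invalid literal for int() with base 10: 'DUMMY'
```

The sketch says nothing about why such a header was present; it only isolates the parsing step that fails.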
Which notebook is this? Is it the main notebook at colabfold.com / https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb? This sounds like a version incompatibility.
Yes, it's the same one
Could you try with only a single ':' between the sequences? I think using a run of several colons as the separator is no longer supported.
Yeah, I tried it and it worked! Well, that solves my query. Thanks for figuring it out!!
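For anyone who hits the same error, the workaround can be applied to a query before pasting it into the notebook. This is a minimal sketch: `normalize_chain_separators` is a hypothetical helper, not part of ColabFold, and the short fragments below stand in for the full Get3 and Get1 sequences.

```python
import re


def normalize_chain_separators(query: str) -> str:
    """Collapse each run of ':' into a single ':' (hypothetical helper)."""
    cleaned = re.sub(r"\s+", "", query)    # drop whitespace and line breaks
    cleaned = re.sub(r":+", ":", cleaned)  # "::::" -> ":"
    return cleaned.strip(":")              # no leading or trailing separator


# Placeholder fragments, not the full sequences from this issue.
query = "MAAGV::::::::LQKDA"
fixed = normalize_chain_separators(query)
chains = fixed.split(":")

print(fixed)                     # MAAGV:LQKDA
print([len(c) for c in chains])  # [5, 5]
```

Printing the chain lengths is a quick way to check that the query was split into the expected number of chains with the expected sizes.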
I'm wondering whether I should mark it as closed or keep it open and let the administrators know that there is such a bug. What do you think?
We supported this syntax in the very beginning; maybe we should reintroduce support for it. Let's leave this open for now.
Okay!
Sorry, I wasn't aware that you were one of its original developers. I'm new to this program/community service. But once more, thanks for looking into it!!