ColabFold fails to predict protein complex (invalid literal for int() with base 10: 'DUMMY' at MSA step)

Open ks-2025 opened this issue 6 months ago • 8 comments

Dear ColabFold Team,

I am trying to model the human Get1-Get3 complex using full-length Get3 and part of Get1 (348 and 59 residues, respectively, as given in the EMBL tutorial (https://www.ebi.ac.uk/training/online/courses/alphafold/accessing-and-predicting-protein-structures-with-alphafold/predicting-protein-structures-with-colabfold-and-alphafold-colab/)). The only setting I changed from the defaults is the model, which I set to 'alphafold2_multimer_v3'; AMBER relaxation and templates are left off (which is standard). The run fails at the MSA step (on ColabFold v1.5.5: AlphaFold2 using MMseqs2). This is the exact sequence I used:

MAAGVAGWGVEAEEFEDAPDVEPLEPTLSNIIEQRSLKWIFVGGKGGVGKTTCSCSLAVQLSKGRESVLIISTDPAHNISDAFDQKFSKVPTKVKGYDNLFAMEIDPSLGVAELPDEFFEEDNMLSMGKKMMQEAMSAFPGIDEAMSYAEVMRLVKGMNFSVVVFDTAPTGHTLRLLNFPTIVERGLGRLMQIKNQISPFISQMCNMLGLGDMNADQLASKLEETLPVIRSVSEQFKDPEQTTFICVCIAEFLSLYETERLIQELAKCKIDTHNIIVNQLVFPDPEKPCKMCEARHKIQAKYLDQMEDLYEDFHIVKLPLLPHEVRGADKVNTFSALLLEPYKPPSAQ::::::::::::::::::LQKDAEQESQMRAEIQDMKQELSTVNMMDEFARYARLERKINKMTDKLKTHVKARTAQL

Below is the error message and traceback:

Could not get MSA/templates for GET1GET3complex_42bc5: invalid literal for int() with base 10: 'DUMMY'

Traceback (most recent call last):
  File "/content/colabfold/batch.py", line 1280, in run
    = get_msa_and_templates(jobname, query_sequence, a3m_lines, result_dir, msa_mode, use_templates,
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/colabfold/batch.py", line 669, in get_msa_and_templates
    a3m_lines = run_mmseqs2(
                ^^^^^^^^^^^^
  File "/content/colabfold/colabfold.py", line 311, in run_mmseqs2
    M = int(line[1:].rstrip())
        ^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'DUMMY'

Can you help me understand what is causing this error, whether it is a user/notebook issue, and whether there is anything I can do to improve my query? If it is a bug, please look into it and fix it.

Thanks for your help!

ks-2025 avatar Jul 05 '25 06:07 ks-2025
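
For readers following along, here is a minimal sketch of why the last frame of the traceback raises. It reuses only the expression shown in the traceback, `int(line[1:].rstrip())`. The helper name `parse_numeric_header` and the example lines (`">101\n"`, `">DUMMY\n"`) are assumptions made for illustration; the real line that ColabFold received is not shown in the report, only that everything after its first character was `DUMMY`.

```python
def parse_numeric_header(line: str) -> int:
    """Hypothetical helper wrapping the expression from the traceback.

    Drops the first character, strips trailing whitespace, and converts
    the remainder to an integer.
    """
    return int(line[1:].rstrip())


# A line whose remainder is numeric parses fine (example value is made up).
ok = parse_numeric_header(">101\n")

# A line whose remainder is the text 'DUMMY' cannot be converted.
try:
    parse_numeric_header(">DUMMY\n")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

The second call reproduces the same `ValueError` text as in the report, which suggests the parser met a non-numeric identifier where it expected a number.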

Which notebook is this? Is it the main notebook at colabfold.com / https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb? This sounds like a version incompatibility.

milot-mirdita avatar Jul 05 '25 06:07 milot-mirdita

Yes, it's the same one

ks-2025 avatar Jul 05 '25 07:07 ks-2025

Could you try with only a single ':' between the sequences? I think using several colons as a separator is not supported anymore.

milot-mirdita avatar Jul 05 '25 07:07 milot-mirdita

Yeah, I tried it and it worked! Well, that solves my query. Thanks for figuring it out!!

ks-2025 avatar Jul 05 '25 08:07 ks-2025
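
The workaround confirmed above (a single ':' between chains) can be applied before pasting a query into the notebook. The following is a minimal sketch, not part of ColabFold: the function names `normalize_chain_separators` and `chain_lengths` are hypothetical, and the short sequences in the usage lines are toy strings rather than the real Get1/Get3 input.

```python
import re


def normalize_chain_separators(query: str) -> str:
    """Hypothetical pre-processing step for a multimer query string.

    Removes all whitespace, collapses every run of ':' into a single ':',
    and drops separators at the very start or end of the string.
    """
    cleaned = re.sub(r"\s+", "", query)    # remove spaces and line breaks
    cleaned = re.sub(r":+", ":", cleaned)  # '::::' becomes ':'
    return cleaned.strip(":")


def chain_lengths(query: str) -> list[int]:
    """Return the length of each chain after normalization."""
    return [len(chain) for chain in normalize_chain_separators(query).split(":")]


# Toy example: two chains separated by a long run of colons.
raw_query = "MAAGVAGW::::::::::::::::::LQKDAE"
fixed_query = normalize_chain_separators(raw_query)
lengths = chain_lengths(raw_query)

print(fixed_query)
print(lengths)
```

Checking the chain lengths against the expected values (348 and 59 residues in the original report) is a quick way to confirm that no residues were lost while editing the separator.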

I'm wondering whether I should mark it as closed or keep it open and let the administrators know that there is such a bug. What do you think?

ks-2025 avatar Jul 05 '25 08:07 ks-2025

We supported this syntax in the very beginning; maybe we should reintroduce support for it. Let's leave this open for now.

milot-mirdita avatar Jul 05 '25 09:07 milot-mirdita

Okay!

ks-2025 avatar Jul 05 '25 10:07 ks-2025

Sorry, I wasn't aware that you were one of its original developers. I'm new to this program/community service. But once more, thanks for looking into it!!

ks-2025 avatar Jul 05 '25 10:07 ks-2025