
Generating template error

Open Abhishaike opened this issue 1 year ago • 4 comments

I'd like to get only the MSA + template, but am running into issues with the template features upon using get_msa_and_templates. It works perfectly fine when use_templates is False, but I get an hhsearch issue when it's set to True.

I'm also happy to move my template search away from ColabFold, since I get the feeling that templates are still being worked on here. Is there an alternative library I could use to generate .pdb template files?

Here's a minimal reproducible example.

bash:

mkdir -p fasta_vol
mkdir -p result

echo ">Sequence_1" > fasta_vol/output1.fasta
echo "MSGMKK:LYEYTVTTLDEFL:EKLKEFILNTSKDKIYKLTITN" >> fasta_vol/output1.fasta
echo ">Sequence_2" > fasta_vol/output2.fasta
echo "VKLPINGW:AVYVHRTLMSCPVGEAWSASACHDG" >> fasta_vol/output2.fasta

python:

from pathlib import Path

from colabfold.batch import get_msa_and_templates, get_queries, safe_filename, msa_to_str
from colabfold.utils import DEFAULT_API_SERVER

fasta_volume_path = 'fasta_vol'
a3m_volume_path = "a3m_vol"
msa_mode = 'mmseqs2_uniref_env'

queries, is_complex = get_queries(fasta_volume_path)
for job_number, (raw_jobname, query_sequence, _) in enumerate(queries):
    jobname = safe_filename(raw_jobname)
    (unpaired_msa, paired_msa, query_seqs_unique, query_seqs_cardinality, template_features) \
                  = get_msa_and_templates(jobname = jobname, 
                                          query_sequences = query_sequence, 
                                          result_dir = Path(a3m_volume_path), 
                                          msa_mode = msa_mode, 
                                          use_templates = True, 
                                          custom_template_path = None, 
                                          pair_mode = "unpaired_paired", 
                                          host_url = DEFAULT_API_SERVER)
    msa = msa_to_str(unpaired_msa, paired_msa, query_seqs_unique, query_seqs_cardinality)
    Path(a3m_volume_path).joinpath(f"{jobname}.a3m").write_text(msa)

Resulting error:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "venv/lib/python3.10/site-packages/colabfold/batch.py", line 780, in get_msa_and_templates
    template_feature = mk_template(
  File "venv/lib/python3.10/site-packages/colabfold/batch.py", line 172, in mk_template
    hhsearch_result = hhsearch_pdb70_runner.query(a3m_lines)
  File "venv/lib/python3.10/site-packages/alphafold/data/tools/hhsearch.py", line 86, in query
    process = subprocess.Popen(
  File "python/lib/python3.10/subprocess.py", line 971, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "python/lib/python3.10/subprocess.py", line 1847, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'hhsearch'
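
The last frame of the traceback shows `subprocess.Popen` failing to launch an executable named `hhsearch`, which suggests the binary is not on the `PATH` of the environment running the script. A minimal sketch of a pre-flight check, using only the standard library, is below. The helper name `missing_binaries` is hypothetical and is not part of ColabFold; the only binary name checked is the one that appears in the traceback.

```python
import shutil


def missing_binaries(names):
    """Return the subset of executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# 'hhsearch' is the executable named in the FileNotFoundError above.
missing = missing_binaries(["hhsearch"])
if missing:
    print(f"Not found on PATH: {', '.join(missing)}")
else:
    print("hhsearch found at:", shutil.which("hhsearch"))
```

Running this before calling get_msa_and_templates with use_templates = True would show whether the failure comes from the environment rather than from the template code path itself.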

Abhishaike, Mar 10 '23 22:03