Database . does not exist

Open biochristmas opened this issue 5 months ago • 7 comments

Hi, I'm encountering an issue while trying to run colabfold_search. I would appreciate any suggestions you might have for resolving it. Thank you in advance!

“colabfold_search test.fasta ./colabfold_db/ test --af3-json --use-templates 1
INFO:colabfold.mmseqs.search:Running mmseqs createdb test/query.fas test/qdb --shuffle 0
createdb test/query.fas test/qdb --shuffle 0

Converting sequences

Time for merging to qdb_h: 0h 0m 0s 1ms
Time for merging to qdb: 0h 0m 0s 0ms
Database type: Aminoacid
Time for processing: 0h 0m 0s 6ms
Traceback (most recent call last):
  File "/data/igem/dxxu/ColabFold_MMseqsGPU/localcolabfold/colabfold-conda/bin/colabfold_search", line 8, in <module>
    sys.exit(main())
  File "/data/igem/dxxu/ColabFold_MMseqsGPU/localcolabfold/colabfold-conda/lib/python3.10/site-packages/colabfold/mmseqs/search.py", line 461, in main
    mmseqs_search_monomer(
  File "/data/igem/dxxu/ColabFold_MMseqsGPU/localcolabfold/colabfold-conda/lib/python3.10/site-packages/colabfold/mmseqs/search.py", line 93, in mmseqs_search_monomer
    raise FileNotFoundError(f"Database {db} does not exist")
FileNotFoundError: Database . does not exist”

biochristmas avatar Jul 26 '25 09:07 biochristmas

Could you post the output of ls ./colabfold_db/?

milot-mirdita avatar Jul 26 '25 09:07 milot-mirdita

Sure, the output of ls ./colabfold_db/ is:

COLABDB_READY
colabfold_envdb_202108_aln.tsv colabfold_envdb_202108_db colabfold_envdb_202108_db_aln colabfold_envdb_202108_db_aln.dbtype colabfold_envdb_202108_db_aln.index colabfold_envdb_202108_db.dbtype colabfold_envdb_202108_db.GPU_READY colabfold_envdb_202108_db_h colabfold_envdb_202108_db_h.dbtype colabfold_envdb_202108_db_h.index colabfold_envdb_202108_db.idx colabfold_envdb_202108_db.idx.dbtype colabfold_envdb_202108_db.idx.index colabfold_envdb_202108_db.index colabfold_envdb_202108_db.lookup colabfold_envdb_202108_db_seq colabfold_envdb_202108_db_seq.dbtype colabfold_envdb_202108_db_seq_h colabfold_envdb_202108_db_seq_h.dbtype colabfold_envdb_202108_db_seq_h.index colabfold_envdb_202108_db_seq.index colabfold_envdb_202108_h.tsv colabfold_envdb_202108_seq.tsv colabfold_envdb_202108.tar.gz colabfold_envdb_202108.tsv
DOWNLOADS_READY
pdb pdb100_230517 pdb100_230517.dbtype pdb100_230517.fasta.gz pdb100_230517_h pdb100_230517_h.dbtype pdb100_230517_h.index pdb100_230517.idx pdb100_230517.idx.dbtype pdb100_230517.idx.index pdb100_230517.index pdb100_230517.lookup pdb100_230517_tmp_h pdb100_230517_tmp_h.dbtype pdb100_230517_tmp_h.index pdb100_a3m.ffdata pdb100_a3m.ffindex pdb100_foldseek_230517.tar.gz
PDB100_READY PDB_MMCIF_READY PDB_READY
tmp1 tmp2 tmp3
uniref30_2302_aln.tsv uniref30_2302_db uniref30_2302_db_aln uniref30_2302_db_aln.dbtype uniref30_2302_db_aln.index uniref30_2302_db.dbtype uniref30_2302_db.GPU_READY uniref30_2302_db_h uniref30_2302_db_h.dbtype uniref30_2302_db_h.index uniref30_2302_db.idx uniref30_2302_db.idx.dbtype uniref30_2302_db.idx.index uniref30_2302_db.idx_mapping uniref30_2302_db.idx_taxonomy uniref30_2302_db.index uniref30_2302_db.lookup uniref30_2302_db_mapping uniref30_2302_db_seq uniref30_2302_db_seq.dbtype uniref30_2302_db_seq_h uniref30_2302_db_seq_h.dbtype uniref30_2302_db_seq_h.index uniref30_2302_db_seq.index uniref30_2302_db_taxonomy uniref30_2302_h.tsv uniref30_2302.md5sum uniref30_2302_seq.tsv uniref30_2302.tar.gz uniref30_2302.tsv
UNIREF30_READY

biochristmas avatar Jul 26 '25 11:07 biochristmas

That is a very confusing error message, but I think I understand what's going on.

We don't set a default path for the template database, apparently. You need to call colabfold_search ... --db2 pdb100_230517. db1 and db3 have default database names (uniref30_2302_db and colabfold_envdb_202108_db respectively).

milot-mirdita avatar Jul 29 '25 05:07 milot-mirdita
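
A small sketch of why the message reads "Database . does not exist", and how one might check a database directory before launching a search. It assumes the unset template database ends up as an empty path (an assumption, not confirmed in this thread) and that a database counts as present when its `<name>.dbtype` file exists, as in the ls output above. The helper `missing_databases` is hypothetical and not part of ColabFold.

```python
from pathlib import Path


def missing_databases(db_dir, names):
    """Return the database names that have no `<name>.dbtype` file in db_dir.

    Hypothetical helper for illustration; not ColabFold's own check.
    """
    db_dir = Path(db_dir)
    missing = []
    for name in names:
        name = Path(name)
        # An empty name collapses to ".", so report it as missing outright.
        if str(name) == "." or not db_dir.joinpath(f"{name}.dbtype").is_file():
            missing.append(str(name))
    return missing


# Assumption: the template database was never set, i.e. it is an empty path.
# pathlib renders an empty path as ".", which matches the error text.
unset_template_db = Path("")
print(f"Database {unset_template_db} does not exist")
```

With the directory listed above, calling `missing_databases("./colabfold_db/", ["uniref30_2302_db", "pdb100_230517", "colabfold_envdb_202108_db"])` should return an empty list, since all three `.dbtype` files are present. Passing `--db2 pdb100_230517` explicitly avoids the empty name.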

"prefilter mark_test/test/prof_res colabfold_db_GPU/pdb100_230517.idx mark_test/test/tmp2/2421078915233214753/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -k 0 --target-search-mode 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --mask-n-repeat 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 64 --compressed 0 -v 3 -s 7.5

Index version: 16
Generated by: 17.b804f
ScoreMatrix: VTML80.out
Query database size: 1 type: Profile
Estimated memory consumption: 1G
Target database size: 329605 type: Aminoacid
Invalid database read for database data file=colabfold_db_GPU/pdb100_230517.idx, database index=colabfold_db_GPU/pdb100_230517.idx.index
getData: local id (4294967295) >= db size (17)
Error: Prefilter died
Traceback (most recent call last):
  File "/share/home/mark/software/ColabFold/localcolabfold/colabfold-conda/bin/colabfold_search", line 8, in <module>
    sys.exit(main())
  File "/share/home/mark/software/ColabFold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/colabfold/mmseqs/search.py", line 461, in main
    mmseqs_search_monomer(
  File "/share/home/mark/software/ColabFold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/colabfold/mmseqs/search.py", line 172, in mmseqs_search_monomer
    run_mmseqs(mmseqs, ["search", base.joinpath("prof_res"), dbbase.joinpath(template_db), base.joinpath("res_pdb"),
  File "/share/home/mark/software/ColabFold/localcolabfold/colabfold-conda/lib/python3.10/site-packages/colabfold/mmseqs/search.py", line 46, in run_mmseqs
    subprocess.check_call([mmseqs] + params)
  File "/share/home/mark/software/ColabFold/localcolabfold/colabfold-conda/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '[PosixPath('mmseqs'), 'search', PosixPath('mark_test/test/prof_res'), PosixPath('colabfold_db_GPU/pdb100_230517'), PosixPath('mark_test/test/res_pdb'), PosixPath('mark_test/test/tmp2'), '--db-load-mode', '0', '--threads', '64', '-s', '7.5', '-a', '-e', '0.1', '--prefilter-mode', '0']' returned non-zero exit status 1."

biochristmas avatar Jul 29 '25 08:07 biochristmas

Hi, thank you for your response. I followed your suggestions and adjusted the parameters, but I'm still encountering errors. However, the error message seems to have changed. I used the following command: "colabfold_search ./test.fasta ./colabfold_db_GPU/ ./mark_test/test --gpu 1 --af3-json --use-templates 1 --db2 pdb100_230517"

biochristmas avatar Jul 29 '25 08:07 biochristmas

I just pushed a fix (379d101a0a6378243405e962939d104e84d2085a) so the template search also uses the GPU correctly.

milot-mirdita avatar Aug 04 '25 05:08 milot-mirdita

Thank you very much for your response. The issue has been resolved after updating ColabFold. However, I noticed that when using the --use-templates and --af3-json parameters, the generated JSON file for AlphaFold3 still contains an empty template field, and only generates a separate .m8 file. This is different from the JSON file content produced by AlphaFold3's data pipeline. Is there a way to incorporate the template information directly into the JSON file?

biochristmas avatar Aug 05 '25 08:08 biochristmas