MMseqs2 GPU version, mmseqs expandaln error: getData: local id (4294967295) >= db size (17)

INFO:colabfold.mmseqs.search:Running mmseqs expandaln out3/qdb /mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx out3/res /mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx out3/res_exp --db-load-mode 0 --threads 18 --expansion-mode 0 -e inf --expand-filter-clusters 1 --max-seq-id 0.95
expandaln out3/qdb /mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx out3/res /mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx out3/res_exp --db-load-mode 0 --threads 18 --expansion-mode 0 -e inf --expand-filter-clusters 1 --max-seq-id 0.95

Index version: 16
Generated by: 2348eb6c754b4b7effb7f8471a9d19d3c0e917e5
ScoreMatrix: VTML80.out
Index version: 16
Generated by: 2348eb6c754b4b7effb7f8471a9d19d3c0e917e5
ScoreMatrix: VTML80.out
Invalid database read for database data file=/mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx, database index=/mnt/d/conda/jupyter/ColabFold/database/uniref30_2302_gpu.idx.index
getData: local id (4294967295) >= db size (17)

The uniref30_2302_db and colabfold_envdb_202108_db databases were downloaded from the ColabFold database server, and they work well with the CPU version of colabfold/mmseqs. I then switched my environment to the GPU version, and the error appears when using mmseqs-linux-gpu.tar.gz (20-Sep-2025 10:46). I converted the database with:

mmseqs makepaddedseqdb uniref30_2302_db uniref30_2302_gpu
mmseqs createindex uniref30_2302_gpu

Is that right? Could you tell me how to convert the database correctly and how to resolve the "mmseqs expandaln" error?

Thank you!

Sep 21 '25 10:09 kehan777

@milot-mirdita has updated the script. Could you please update your repository and try again?

Sep 21 '25 10:09 martin-steinegger

I downloaded https://dev.mmseqs.com/latest/mmseqs-linux-gpu.tar.gz (20-Sep-2025 10:46). The following test works well:

mmseqs createdb examples/DB.fasta targetDB
mmseqs makepaddedseqdb targetDB targetDB_padded
mmseqs easy-search examples/QUERY.fasta targetDB_padded alnRes.m8 tmp --gpu 1

Then I ran colabfold_search, which fails with an error at this step:

mmseqs expandaln out3/qdb uniref30_2302_gpu.idx out3/res uniref30_2302_gpu.idx out3/res_exp --db-load-mode 0 --threads 18 --expansion-mode 0 -e inf --expand-filter-clusters 1 --max-seq-id 0.95

MMseqs Version: 1046260d43f8d721041dec43a1763ecc450a6ea9
Expansion mode 0
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Max sequence length 65535
Score bias 0
Compositional bias 1
Compositional bias scale 1
E-value threshold inf
Seq. id. threshold 0
Coverage threshold 0
Coverage mode 0
Pseudo count mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Expand filter clusters 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.95
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Preload mode 0
Compressed 0
Threads 18
Verbosity 3

Index version: 16
Generated by: 2348eb6c754b4b7effb7f8471a9d19d3c0e917e5
ScoreMatrix: VTML80.out
Index version: 16
Generated by: 2348eb6c754b4b7effb7f8471a9d19d3c0e917e5
ScoreMatrix: VTML80.out
Invalid database read for database data file=uniref30_2302_gpu.idx, database index=uniref30_2302_gpu.idx.index
getData: local id (4294967295) >= db size (17)

When I switch the environment in ~/.bashrc back to the CPU version of mmseqs and use the CPU database, the run succeeds.

With the GPU version, "mmseqs expandaln" still fails to read the database.

Sep 21 '25 13:09 kehan777

Can you post a directory listing of the database directory (i.e. where uniref30_2302_gpu.idx is) and also post the contents of this file:

cat uniref30_2302_gpu.idx.index
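
A minimal sketch of how both pieces of information could be collected in one go. The helper name show_db_index and its two arguments are hypothetical (they are not part of MMseqs2 or ColabFold); the function only wraps ls, cat, and wc, and it assumes the .index file holds one entry per line.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2 or ColabFold):
# prints a directory listing, the raw .index contents, and the entry count.
show_db_index() {
    db_dir="$1"      # directory that holds the database files
    idx_name="$2"    # e.g. uniref30_2302_gpu.idx
    index_file="$db_dir/$idx_name.index"

    ls -la "$db_dir"

    if [ ! -f "$index_file" ]; then
        echo "missing: $index_file" >&2
        return 1
    fi

    cat "$index_file"
    # Assumption: one entry per line, so the line count is the entry count.
    printf 'entries: %s\n' "$(wc -l < "$index_file" | tr -d ' ')"
}

# Usage (path taken from the log earlier in this thread):
# show_db_index /mnt/d/conda/jupyter/ColabFold/database uniref30_2302_gpu.idx
```

The entry count printed at the end can be compared with the db size reported in the error message (17).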

Sep 22 '25 05:09 milot-mirdita

It looks like it did not add the _aln* database to the .idx. I am not sure how this might have happened. I recommend rerunning the setup_databases.sh script (https://raw.githubusercontent.com/sokrypton/ColabFold/refs/heads/main/setup_databases.sh) from scratch.
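
As a side note on the numbers in the error message: 4294967295 equals 2^32 - 1, the largest unsigned 32-bit value. A plausible reading, stated here as an assumption rather than something confirmed in this thread, is that the lookup for the missing entry returned a "not found" sentinel, which was then rejected because it is far beyond the 17 entries in the .idx. The arithmetic can be checked with a short sketch, assuming a shell that evaluates $(( )) with 64-bit integers:

```shell
#!/bin/sh
# Largest unsigned 32-bit value; assumes $(( )) uses 64-bit integers.
uint32_max=$(( (1 << 32) - 1 ))
reported_id=4294967295   # local id copied from the error message
db_size=17               # db size copied from the error message

echo "uint32 max: $uint32_max"
if [ "$reported_id" -eq "$uint32_max" ]; then
    echo "reported id equals uint32 max"
fi
if [ "$reported_id" -ge "$db_size" ]; then
    echo "reported id is out of range for db size $db_size"
fi
```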

Sep 22 '25 05:09 milot-mirdita