GPU supported databases

Open rukibuki opened this issue 1 year ago • 5 comments

I was wondering whether any databases have already been converted to support GPU. I have converted uniref90 myself, following the short section on that here on your GitHub, but if you have already made all or many of the databases available in a GPU version, I would much prefer to download those.

rukibuki avatar Nov 27 '24 12:11 rukibuki

We haven't made any plans for prebuilt GPU databases yet. I agree that this would be useful.

For now, you have to call makepaddedseqdb on the target database after a createdb or databases call. See https://github.com/soedinglab/MMseqs2/wiki#gpu-accelerated-search
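As a rough illustration of that step, here is a minimal sketch based on the GPU section of the wiki linked above. The database names, query.fasta, result.m8, and tmp are placeholders, and the run, make_gpu_db, and gpu_search helpers are hypothetical wrappers that are not part of MMseqs2. By default the script only prints the commands; set DRY_RUN=0 to execute them, which assumes an existing target database called uniref90 and a GPU-enabled mmseqs build.

```shell
#!/bin/sh
# Sketch: pad an existing MMseqs2 target database for GPU search.
# All paths and file names are placeholders (assumptions), not from the thread.
set -eu

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: existing target DB (built with createdb or databases)
# $2: name of the padded GPU DB to create
make_gpu_db() {
    run mmseqs makepaddedseqdb "$1" "$2"
}

# $1: query FASTA, $2: padded GPU DB, $3: result file, $4: temp directory
gpu_search() {
    run mmseqs easy-search "$1" "$2" "$3" "$4" --gpu 1
}

make_gpu_db uniref90 uniref90_gpu
gpu_search query.fasta uniref90_gpu result.m8 tmp
```

The padded database is written as a separate database next to the original, so the CPU version stays usable.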

milot-mirdita avatar Nov 27 '24 13:11 milot-mirdita

Yes, that is exactly what I have done so far on uniref90, and it took some time to process that database. I am just unsure which database (or databases) to convert like that if I want to use them for MSA generation in an AF3 pipeline afterwards. Any suggestions?

rukibuki avatar Nov 27 '24 13:11 rukibuki

We plan to make MMseqs2-GPU databases available for ColabFold, which we also plan to optimize for AF3 (or for some of the free alternatives) at some point soon.

milot-mirdita avatar Nov 27 '24 15:11 milot-mirdita

FWIW, let me know if you're interested in hosting these optimized databases on AWS. We have a program to sponsor high-impact scientific data sets (e.g. https://registry.opendata.aws/openfold/) and this would def count

brianloyal avatar Dec 19 '24 20:12 brianloyal

@brianloyal it would be fantastic to host the database on AWS. I recall applying to AWS in the past, but unfortunately, we were rejected. Do you have any recommendations to improve our chances this time?

martin-steinegger avatar Jan 23 '25 14:01 martin-steinegger