
Alternative database download location?

Open james-vincent opened this issue 3 years ago • 2 comments

Are the databases on https://colabfold.mmseqs.com/ available anywhere else, like AWS? Download speeds are very slow and seem to be limited at the mmseqs.com end.

Expected Behavior

Download the UniRef30 database in about an hour at a download speed of 150 Mbps.

Current Behavior

An estimated ~40 hours to download UniRef30. Tested from various institutions, all with 10 Gbps internet connections.

Steps to Reproduce (for bugs)

Download: http://wwwuser.gwdg.de/~compbiol/colabfold/uniref30_2103.tar.gz

ColabFold Output (for bugs)

Please make sure to also post the complete ColabFold output. You can use gist.github.com for large output.

Context

Downloading the databases listed on https://colabfold.mmseqs.com/ is extremely slow. I attempted it from multiple locations on different days, all with the same result.

james-vincent, Feb 08 '22 18:02

I recommend using aria2c with its -x parameter for multiple parallel download connections.
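As a minimal sketch of that suggestion: the URL is the one from the issue, while the `fetch_db` helper name and the connection counts are illustrative assumptions, not values confirmed by the maintainers. Whether the server accepts this many parallel connections, and how much it helps, is not guaranteed.

```shell
# URL as posted in the issue; everything else here is an assumption.
URL="http://wwwuser.gwdg.de/~compbiol/colabfold/uniref30_2103.tar.gz"

# Hypothetical helper wrapping aria2c.
#   -x 16  up to 16 connections per server (the maximum aria2c allows)
#   -s 16  split the file into 16 pieces fetched in parallel
#   -c     resume a partially downloaded file instead of restarting
fetch_db() {
    aria2c -x 16 -s 16 -c "$1"
}
```

Calling `fetch_db "$URL"` starts the download. If a transfer is interrupted, rerunning the same command in the same directory should pick up where it left off thanks to `-c`.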

Sorry, the databases are so large that GWDG is currently the only option for us. Cloud providers would be extremely expensive. If you have any alternative mirrors we would be happy to upload the databases there.

milot-mirdita, Feb 08 '22 18:02