sourmash
Genbank downloading problems
I'm having difficulty downloading the Genbank databases. I was able to download GTDB and Genbank viral k31. Is there any other place I can find these files? Please suggest how I can download them. I use curl, and usually I either get this error: curl: (33) HTTP server doesn't seem to support byte ranges, or the download just times out.
Hi sourmash creators! Thanks a lot for your work! I am having the exact same problem as described above. Do you have any idea how to solve it? It seems that the server we are downloading from times out after a certain amount of time.
Thanks in advance! Gabri
hi! sorry about this, it's been difficult to find good places to store these files ;(.
these problems occur when using the dweb.link URLs at https://sourmash.readthedocs.io/en/latest/databases.html, right? If so there are some options discussed here but it is not simple at the moment... I'll see if I can document it more clearly today.
yes -- they've been plaguing me for days. another potential alternative -- the OSF links are super fast. are you trying to move away from google drive to OSF for the large files? in the short term can we update the documentation so that it downloads from the links on OSF until we fix the dweb links?
some solutions until we move everything to R2:
remove https://dweb.link/ipfs/ from download URL
- for genbank-2022.03-viral-k21.zip: https://dweb.link/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi -> bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi
with the cloudflare gateway:
- wget -O genbank-2022.03-viral-k21.zip https://cloudflare-ipfs.com/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi
with ipget:
- grab ipget from https://dist.ipfs.io/#ipget
- ipget -O genbank-2022.03-viral-k21.zip bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi
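if you have several files to fetch, the URL rewrite above can be scripted. this is only a rough sketch: the helper names cid_from_url and gateway_url are made up for illustration (they are not part of sourmash or ipget), and it assumes every download URL starts with the https://dweb.link/ipfs/ prefix. it only prints the rewritten values, so nothing is downloaded until you pass the result to wget yourself.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of sourmash or ipget.

# Strip the dweb.link gateway prefix from a download URL, leaving the bare IPFS CID.
# Assumes the URL starts with https://dweb.link/ipfs/; other input is returned unchanged.
cid_from_url() {
    printf '%s\n' "${1#https://dweb.link/ipfs/}"
}

# Build the equivalent download URL on another gateway.
# The second argument is the gateway base; it defaults to the Cloudflare gateway used above.
gateway_url() {
    printf '%s/%s\n' "${2:-https://cloudflare-ipfs.com/ipfs}" "$(cid_from_url "$1")"
}

url="https://dweb.link/ipfs/bafybeicjyx6qkhdtw6q4cxs6fyl46gqfhd4q5eqje5lkswf2npljnyytzi"

# Print the bare CID (usable with ipget) and the rewritten gateway URL (usable with wget).
cid_from_url "$url"
gateway_url "$url"

# To actually download, pass the rewritten URL to wget as in the command above:
# wget -O genbank-2022.03-viral-k21.zip "$(gateway_url "$url")"
```

the gateway base is an optional second argument, so the same function can point at a different gateway if the cloudflare one also misbehaves.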
Hello! I am pleased to report that our databases may now be Robustly Available via the local UC Davis infrastructure of the dib-lab ;).
Please see #2255 for the PR; you can view the databases file directly here until that PR is merged, at which point it will show up here.
I will update this issue once the PR is merged (which should be fairly quickly).
🎉
Merged & docs updated: prepared databases page here now has robustified farm links.