All grabseqs SRA downloads failing

Looks like some changes on the NCBI side led to failures in SRA downloads:

grabseqs sra SRR11733975
Traceback (most recent call last):
  File "/users/cdiener/miniconda3/envs/sra/bin/grabseqs", line 11, in <module>
    sys.exit(main())
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/__init__.py", line 58, in main
    metadata_agg = process_sra(args, zip_func)
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/sra.py", line 31, in process_sra
    metadata_agg)
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/sra.py", line 97, in get_sra_acc_metadata
    run_col = lines[0].index("Run")
ValueError: 'Run' is not in list

This seems to be caused by a hardcoded URL for downloading the SRA manifest that is no longer reachable.

Jun 28 '22 18:06 cdiener

Having exactly the same issue (tried a few min ago)

Jun 29 '22 00:06 AntonioBaeza

same issue

Jun 30 '22 01:06 Zeroo11

Thanks for reporting the issue! Looks like @cdiener is right on, http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?save=efetch&db=sra&rettype=runinfo&term= redirects to https://www.ncbi.nlm.nih.gov/sviewer/?db=sra&1%3Fdb=sra&rettype=runinfo&save=efetch&term= and no longer returns metadata. I'll try to figure out the proper endpoint for their API to hit for the SRA metadata. (and see if I can get the tests passing in the meantime).

This is probably due to NCBI retiring Trace.

Looking through the NCBI E-utils API documentation, I should be able to get the same metadata by:

1. Finding the identifiers associated with the accession using esearch, e.g. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=sra&term=PRJNA836386&retmax=999
2. Passing that id list to efetch, e.g. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=sra&id=22439955&rettype=fasta&retmode=text

I'll just have to convert the output from XML to tab-separated, since it looks like the E-utils love XML. This approach also has the advantage of using a documented API rather than that Trace URL (which worked great, but I think I originally found it on StackOverflow or something).
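The two steps above can be sketched roughly as follows. This is only an illustration, not the grabseqs implementation: the function names are hypothetical, the `rettype`/`retmode` defaults simply mirror the example efetch URL in this comment (the thread does not settle which `rettype` returns run metadata), and the XML parsing assumes the esearch response lists ids as `<Id>` elements inside `<IdList>`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=999):
    """Step 1: build the esearch URL for an accession (hypothetical helper)."""
    params = {"db": "sra", "term": term, "retmax": retmax}
    return EUTILS_BASE + "/esearch.fcgi?" + urlencode(params)


def parse_esearch_ids(xml_text):
    """Collect the <Id> values from an esearch XML response.

    Assumes the ids sit under <IdList> directly below the root element.
    """
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


def efetch_url(ids, rettype="fasta", retmode="text"):
    """Step 2: build the efetch URL for a list of ids (hypothetical helper).

    The defaults copy the example URL from the thread; they are not verified
    to return the run metadata grabseqs needs.
    """
    params = {"db": "sra", "id": ",".join(ids), "rettype": rettype, "retmode": retmode}
    # Keep commas literal so the id list stays readable in the URL.
    return EUTILS_BASE + "/efetch.fcgi?" + urlencode(params, safe=",")
```

The text returned by the first URL would be passed to `parse_esearch_ids`, and the resulting list to `efetch_url`; the actual HTTP requests are left out so the sketch stays independent of any particular client library.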

Jun 30 '22 20:06 louiejtaylor

You can also request JSON from esearch, which should be easier to convert with Python. For your example: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=sra&term=PRJNA836386&retmax=999&retmode=json .
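A minimal sketch of reading the id list from that JSON response, assuming the ids are returned under `esearchresult` → `idlist`. The function name is hypothetical and the HTTP request itself is omitted.

```python
import json


def ids_from_esearch_json(json_text):
    """Return the id list from an esearch response requested with retmode=json.

    Assumes the layout {"esearchresult": {"idlist": [...]}}; returns an empty
    list if either key is missing.
    """
    payload = json.loads(json_text)
    return payload.get("esearchresult", {}).get("idlist", [])
```

Compared with the XML response, this avoids any tree traversal: the ids come back as a plain Python list of strings.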

Jul 01 '22 17:07 cdiener

Hello :) Is there any workaround until this is fixed?

Jul 09 '22 17:07 GitUser42

Try replacing line 94 of /usr/local/lib/python3.6/site-packages/grabseqslib/sra.py (adjust the path to your own installation) with:

metadata = requests.get("https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be/sra-db-be.cgi?rettype=runinfo&term="+pacc)
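For anyone who wants to check the replacement endpoint before patching the installed file, here is a rough sketch. It is not the grabseqs code: the helper names are hypothetical, and it assumes the endpoint returns comma-separated text whose header row contains a `Run` column, which is the column the traceback above fails to find.

```python
import csv
import io

# Endpoint taken from the workaround above.
RUNINFO_BASE = "https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be/sra-db-be.cgi"


def runinfo_url(pacc):
    """Build the runinfo URL for an accession, as in the patched line."""
    return RUNINFO_BASE + "?rettype=runinfo&term=" + pacc


def run_accessions(runinfo_text):
    """Return the values of the 'Run' column from a runinfo response.

    Raises ValueError with a descriptive message when the header has no
    'Run' column, e.g. because the server returned an HTML page instead.
    """
    rows = list(csv.reader(io.StringIO(runinfo_text)))
    if not rows or "Run" not in rows[0]:
        raise ValueError("response has no 'Run' column; the endpoint may have changed")
    run_col = rows[0].index("Run")
    return [row[run_col] for row in rows[1:] if len(row) > run_col and row[run_col]]


# Example use (needs network access and the requests package):
# import requests
# text = requests.get(runinfo_url("SRR11733975")).text
# print(run_accessions(text))
```

To find the copy of sra.py that your environment actually uses, `python -c "import grabseqslib; print(grabseqslib.__file__)"` prints the package location.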

Jul 10 '22 15:07 zhengjxj

Thanks [zhengjxj](https://github.com/zhengjxj). I replaced the line in the file you indicated and it is working again!

Jul 17 '22 00:07 AntonioBaeza

Thank you. It seems that the NCBI API changed.

Jul 18 '22 13:07 chansigit

Thanks, @zhengjxj!

Jun 11 '23 07:06 xiachenrui

Hi, is grabseqs sra facing the same problem again? What would be the solution this time?

May 13 '24 17:05 AMMHasan
