
[BUG] Example download doesn't work

Open • MrOlm opened this issue 1 year ago • 1 comment

Running the Colab notebook (https://colab.research.google.com/github/saketkc/pysradb/blob/master/notebooks/02.Commandline_download.ipynb#scrollTo=YQLxy1yzH6dQ) fails at this step:

!pysradb srx-to-srr SRX4720625 --detailed | pysradb download

with the error:


```
Using recommended_url instead.

Traceback (most recent call last):
  File "/usr/local/bin/pysradb", line 8, in <module>
    sys.exit(parse_args())
  File "/usr/local/lib/python3.7/dist-packages/pysradb/cli.py", line 1219, in parse_args
    args.threads,
  File "/usr/local/lib/python3.7/dist-packages/pysradb/cli.py", line 121, in download
    threads=threads,
  File "/usr/local/lib/python3.7/dist-packages/pysradb/sradb.py", line 1523, in download
    + ".sra"
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py", line 5487, in __getattr__
    return object.__getattribute__(self, name)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/accessor.py", line 181, in __get__
    accessor_obj = self._accessor(obj)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/strings/accessor.py", line 168, in __init__
    self._inferred_dtype = self._validate(data)
  File "/usr/local/lib/python3.7/dist-packages/pandas/core/strings/accessor.py", line 225, in _validate
    raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!
```
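For context, the final frames of the traceback come from pandas, which refuses to build the `.str` accessor on a column that does not hold strings. The traceback does not show which column pysradb was processing, so the sketch below only reproduces the generic pandas behaviour. The all-missing column and the helper name `str_accessor_error` are illustrative assumptions, not pysradb code.

```python
import pandas as pd


def str_accessor_error(series):
    """Return the AttributeError message raised by `.str`, or None if it works."""
    try:
        series.str  # pandas validates the values when the accessor is built
    except AttributeError as exc:
        return str(exc)
    return None


# Assumption for illustration: a column holding only missing values,
# which pandas infers as float64 rather than as strings.
all_missing = pd.Series([float("nan"), float("nan")])
text = pd.Series(["SRR7882015"])

print(all_missing.dtype)                # float64
print(str_accessor_error(all_missing))  # the AttributeError message
print(str_accessor_error(text))         # None, the accessor is available
```

If something similar happens inside `download`, a column that was expected to contain accessions or URLs may have been parsed as a non-string dtype, but that is a guess rather than a confirmed diagnosis.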

MrOlm • Oct 10 '22 23:10

Thanks, I can confirm that it is currently broken. For now, I recommend dumping the metadata to a TSV file and then downloading from the URL in the sra_url field:

$ pysradb srx-to-srr SRX4720625 --detailed --saveto x.tsv

experiment_accession	run_accession	study_accession	study_title	experiment_title	experiment_desc	organism_taxid	organism_name	library_name	library_strategy	library_source	library_selection	library_layout	sample_accession	sample_title	instrument	instrument_model	instrument_model_desc	total_spots	total_size	run_total_spots	run_total_bases	run_alias	sra_url_alt	sra_url	AWS_url	AWS_free_egress	AWS_access_type	experiment_alias	source_name	tissue	developmental stage	gfp status	genetic background	ena_fastq_http	ena_fastq_http_1	ena_fastq_http_2	ena_fastq_ftp	ena_fastq_ftp_1	ena_fastq_ftp_2
SRX4720625	SRR7882015	SRP162234	Transcriptomic profile of zebrafish cardiomyocytes throughout heart development	GSM3396533: wt_GFPpos_24hpf_rep1; Danio rerio; RNA-Seq	GSM3396533: wt_GFPpos_24hpf_rep1; Danio rerio; RNA-Seq	7955	Danio rerio	<NA>	RNA-Seq	TRANSCRIPTOMIC	cDNA	PAIRED	SRS3805811	<NA>	NextSeq 500	NextSeq 500	ILLUMINA	47867961	3470385670	47867961	7230485009	GSM3396533_r1	s3://sra-pub-src-3/SRR7882015/RNA_cardio_pos_24hpf_rep_1_R2.fq.gz	https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR7882015/SRR7882015	https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR7882015/SRR7882015	worldwide	anonymous	GSM3396533	FACS-sorted embryo cells	FACS-sorted embryo cells	24 hpf	GFP positive	wild type	<NA>	http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR788/005/SRR7882015/SRR7882015_1.fastq.gz	http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR788/005/SRR7882015/SRR7882015_2.fastq.gz	<NA>	[email protected]:vol1/fastq/SRR788/005/SRR7882015/SRR7882015_1.fastq.gz	[email protected]:vol1/fastq/SRR788/005/SRR7882015/SRR7882015_2.fastq.gz
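The second half of the workaround (using the sra_url field to download) could be scripted roughly as follows. This is a minimal sketch using only the Python standard library, assuming the saved file is tab-delimited and contains the `run_accession` and `sra_url` columns shown above. The function names `read_sra_urls` and `download_runs` and the `sra_downloads` output directory are hypothetical and not part of pysradb.

```python
import csv
import shutil
import urllib.request
from pathlib import Path


def read_sra_urls(tsv_path):
    """Return (run_accession, sra_url) pairs from a TSV saved with --saveto."""
    pairs = []
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            url = (row.get("sra_url") or "").strip()
            # Skip rows without a usable URL (empty, or written as <NA>).
            if url and url != "<NA>":
                pairs.append((row["run_accession"].strip(), url))
    return pairs


def download_runs(tsv_path, out_dir="sra_downloads"):
    """Download each sra_url into out_dir as <run_accession>.sra."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for run, url in read_sra_urls(tsv_path):
        target = out / f"{run}.sra"
        # Stream the response to disk instead of holding it in memory.
        with urllib.request.urlopen(url) as response, open(target, "wb") as fh:
            shutil.copyfileobj(response, fh)
        print(f"saved {target}")
```

Calling `download_runs("x.tsv")` would then fetch one file per run. Runs are several gigabytes each (the `total_size` column above reports about 3.5 GB for this one), so it may be worth checking free disk space first.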

saketkc • Oct 10 '22 23:10