pysradb
[BUG] Example download doesn't work
Running the colab notebook (https://colab.research.google.com/github/saketkc/pysradb/blob/master/notebooks/02.Commandline_download.ipynb#scrollTo=YQLxy1yzH6dQ)
fails on the step
!pysradb srx-to-srr SRX4720625 --detailed | pysradb download
with the error
Using recommended_url instead.
Traceback (most recent call last):
File "/usr/local/bin/pysradb", line 8, in <module>
sys.exit(parse_args())
File "/usr/local/lib/python3.7/dist-packages/pysradb/cli.py", line 1219, in parse_args
args.threads,
File "/usr/local/lib/python3.7/dist-packages/pysradb/cli.py", line 121, in download
threads=threads,
File "/usr/local/lib/python3.7/dist-packages/pysradb/sradb.py", line 1523, in download
+ ".sra"
File "/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py", line 5487, in __getattr__
return object.__getattribute__(self, name)
File "/usr/local/lib/python3.7/dist-packages/pandas/core/accessor.py", line 181, in __get__
accessor_obj = self._accessor(obj)
File "/usr/local/lib/python3.7/dist-packages/pandas/core/strings/accessor.py", line 168, in __init__
self._inferred_dtype = self._validate(data)
File "/usr/local/lib/python3.7/dist-packages/pandas/core/strings/accessor.py", line 225, in _validate
raise AttributeError("Can only use .str accessor with string values!")
AttributeError: Can only use .str accessor with string values!
Thanks, I can confirm that it is currently broken. For now, I recommend dumping the metadata to a TSV file and then using the sra_url field to download:
$ pysradb srx-to-srr SRX4720625 --detailed --saveto x.tsv
experiment_accession run_accession study_accession study_title experiment_title experiment_desc organism_taxid organism_name library_name library_strategy library_source library_selection library_layout sample_accession sample_title instrument instrument_model instrument_model_desc total_spots total_size run_total_spots run_total_bases run_alias sra_url_alt sra_url AWS_url AWS_free_egress AWS_access_type experiment_alias source_name tissue developmental stage gfp status genetic background ena_fastq_http ena_fastq_http_1 ena_fastq_http_2 ena_fastq_ftp ena_fastq_ftp_1 ena_fastq_ftp_2
SRX4720625 SRR7882015 SRP162234 Transcriptomic profile of zebrafish cardiomyocytes throughout heart development GSM3396533: wt_GFPpos_24hpf_rep1; Danio rerio; RNA-Seq GSM3396533: wt_GFPpos_24hpf_rep1; Danio rerio; RNA-Seq 7955 Danio rerio <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS3805811 <NA> NextSeq 500 NextSeq 500 ILLUMINA 47867961 3470385670 47867961 7230485009 GSM3396533_r1 s3://sra-pub-src-3/SRR7882015/RNA_cardio_pos_24hpf_rep_1_R2.fq.gz https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR7882015/SRR7882015 https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR7882015/SRR7882015 worldwide anonymous GSM3396533 FACS-sorted embryo cells FACS-sorted embryo cells 24 hpf GFP positive wild type <NA> http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR788/005/SRR7882015/SRR7882015_1.fastq.gz http://ftp.sra.ebi.ac.uk/vol1/fastq/SRR788/005/SRR7882015/SRR7882015_2.fastq.gz <NA> [[email protected]](mailto:[email protected]):vol1/fastq/SRR788/005/SRR7882015/SRR7882015_1.fastq.gz [[email protected]](mailto:[email protected]):vol1/fastq/SRR788/005/SRR7882015/SRR7882015_2.fastq.gz
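Once x.tsv exists, the download step could be scripted roughly as below. This is a minimal sketch using only the Python standard library, not pysradb's own download code. It assumes the file is tab-separated with the run_accession and sra_url columns shown above, and that missing values appear either as empty cells or as `<NA>`. The function names `read_sra_urls` and `download_runs` and the `<run_accession>.sra` output naming are choices made for this example.

```python
import csv
import shutil
import urllib.request
from pathlib import Path


def read_sra_urls(tsv_path):
    """Return (run_accession, sra_url) pairs from a metadata TSV.

    Assumes a tab-separated file with 'run_accession' and 'sra_url' columns.
    """
    pairs = []
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            run = (row.get("run_accession") or "").strip()
            url = (row.get("sra_url") or "").strip()
            # Skip rows without a usable URL (empty cell or "<NA>").
            if run and url and url != "<NA>":
                pairs.append((run, url))
    return pairs


def download_runs(tsv_path, out_dir="."):
    """Download every sra_url in the TSV to <out_dir>/<run_accession>.sra."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for run, url in read_sra_urls(tsv_path):
        target = out / f"{run}.sra"
        # Stream the response to disk instead of loading it into memory;
        # SRA files can be several gigabytes.
        with urllib.request.urlopen(url) as response, open(target, "wb") as fh:
            shutil.copyfileobj(response, fh)
        saved.append(target)
    return saved
```

Calling `download_runs("x.tsv", "downloads")` would then fetch each run listed in the file. There is no retry or resume logic here, so an interrupted transfer of a large file has to be restarted from the beginning.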