Use --origfmt with fastq-dump

Open arteymix opened this issue 5 years ago • 0 comments

Currently, we use the SRA format for FASTQ headers, which prefixes the SRR run accession to the original string from the sequencer. This format is incompatible with ArrayExpress and local sources, and it will become a problem if we try to generalize batch information extraction to arbitrary FASTQ files rather than just GEO series.

The solution is to add the --origfmt flag to fastq-dump so that the original header will be used instead.
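
As a rough illustration of the change, the sketch below builds a fastq-dump argument list with --origfmt included. The function name, the example file name, and the output directory are hypothetical and are not taken from the pipeline's code; only the --origfmt and --outdir flags come from fastq-dump itself.

```python
import shlex


def build_fastq_dump_command(sra_path, output_dir, use_original_header=True):
    """Build a fastq-dump argument list for one SRA run file.

    Hypothetical helper for illustration; the pipeline's real task
    definition may assemble its command differently.
    """
    command = ["fastq-dump", "--outdir", output_dir]
    if use_original_header:
        # --origfmt keeps only the original sequence name in the defline,
        # dropping the run accession prefix that SRA adds by default.
        command.append("--origfmt")
    command.append(sra_path)
    return command


if __name__ == "__main__":
    # Print the command as a shell-quoted string (file names are made up).
    print(shlex.join(build_fastq_dump_command("SRR0000001.sra", "fastq")))
```

Keeping the flag behind a parameter would make it possible to fall back to the SRA-style header while downstream parsing is being adjusted.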

This might require some adjustment in how Gemma parses the batch information.
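
To make the possible parsing adjustment concrete, here is a minimal sketch of a header parser that accepts both styles. It is not Gemma's actual parser. It assumes the SRA-style defline looks like `@SRR<digits>.<spot> <original name> length=<n>` and that the original name follows the Illumina CASAVA 1.8+ layout (`instrument:run:flowcell:lane:tile:x:y`); headers from other platforms would simply return None. All function names are hypothetical.

```python
import re

# Assumed SRA-style prefix: run accession, a dot, and the spot index.
SRA_PREFIX = re.compile(r"^@[SED]RR\d+\.\d+\s+")
# Assumed SRA-style suffix appended after the original name.
LENGTH_SUFFIX = re.compile(r"\s+length=\d+$")


def original_name(header):
    """Return the sequencer's original read name from either header style."""
    header = header.strip()
    stripped = SRA_PREFIX.sub("", header)
    if stripped != header:
        # SRA-style header: also drop the trailing length field.
        return LENGTH_SUFFIX.sub("", stripped)
    # Already an original-format header: drop the leading '@' if present.
    return header[1:] if header.startswith("@") else header


def parse_illumina_batch(header):
    """Extract batch fields from an Illumina CASAVA 1.8+ style header.

    Returns None when the read name does not have the seven
    colon-separated fields this sketch expects.
    """
    name = original_name(header).split()[0]
    fields = name.split(":")
    if len(fields) != 7:
        return None
    instrument, run, flowcell, lane = fields[:4]
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
    }
```

With a parser along these lines, the same batch fields would be recovered whether the FASTQ came from fastq-dump with or without --origfmt, or from a non-SRA source that preserves the sequencer's header.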

arteymix, Jan 24 '20 18:01