rnaseq-pipeline
RNA-seq pipeline for raw sequence alignment and transcript/gene quantification.
Batch info for split experiments is displayed all together. For example, GSE147432 was split into three datasets (GSE147432.1, GSE147432.2, and GSE147432.3), but the batch info for each of them is identical...
We can get an idea of the thresholds to apply by looking at the data we gathered so far.
The real issue is that the tool does not exit with a non-zero code when it fails due to a download size limit.
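Until the tool reports failures through its exit code, one possible workaround is to verify the download ourselves. The sketch below is only an illustration: `run_and_verify` and the `expected_output` path are hypothetical names, not part of the pipeline, and the check assumes that a failed download leaves the output file missing or empty.

```python
import subprocess
from pathlib import Path


def run_and_verify(cmd: list[str], expected_output: Path) -> None:
    """Run a command and fail loudly if it did not produce its output.

    Hypothetical helper: it checks the exit code, then additionally checks
    that the expected file exists and is non-empty, since the exit code
    alone is not reliable for this failure mode.
    """
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited with code {result.returncode}")
    # Assumption: a silent failure leaves no file, or an empty one.
    if not expected_output.is_file() or expected_output.stat().st_size == 0:
        raise RuntimeError(f"{cmd[0]} exited with 0 but {expected_output} is missing or empty")
```

Raising an exception lets the calling task be marked as failed instead of silently passing an incomplete download downstream.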
As per https://github.com/PavlidisLab/GemmaCuration/issues/3#issuecomment-759691608, it would be valuable to perform some sanity checks before submitting data to Gemma. Here's a list of basic things that we can do: - [ ]...
I've encountered a couple of archives in SRA that replace the `:` separator with `_` in their FASTQ headers. We would have to handle this specific case in `ExtractGeoSeriesBatchInfo`.
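A minimal sketch of how the fallback could look, assuming Illumina-style identifiers with a fixed number of fields (seven by default here). `split_illumina_fields` and `n_fields` are hypothetical names for illustration; this is not the existing logic in `ExtractGeoSeriesBatchInfo`.

```python
def split_illumina_fields(identifier: str, n_fields: int = 7) -> list[str]:
    """Split an Illumina-style read identifier into its fields.

    Assumption: the identifier carries n_fields fields, normally separated
    by ':'. When no ':' is present, fall back to '_' and split from the
    right, so underscores inside the leading (instrument) field survive.
    """
    if ":" in identifier:
        return identifier.split(":")
    return identifier.rsplit("_", n_fields - 1)


fields = split_illumina_fields("HWI_ST1234_100_C0ABCACXX_1_1101_1234_5678")
# fields[0] is "HWI_ST1234"; the remaining six fields are split on "_"
```

Splitting from the right matters because instrument names may themselves contain underscores, whereas the trailing fields (lane, tile, coordinates) are numeric and never do. Older header layouts with fewer fields would need a different `n_fields` value.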
There's been discussion about using a subset of [CWL](https://www.commonwl.org/) to produce a file with structured pipeline metadata. This process can be entirely automated by introspecting the Luigi task graph and...