how_are_we_stranded_here
This software cannot be used with single-end RNA-seq data
command
check_strandedness --gtf /xxx/Mus_musculus.GRCm38.102.gtf --transcripts /xxx/Mus_musculus.GRCm38.dna.toplevel.fa SRR1538550.fastq.gz
log
usage: check_strandedness [-h] -g GTF [-fa TRANSCRIPTS] [-n NREADS] -r1 READS_1 -r2 READS_2 [-k KALLISTO_INDEX] [-p]
check_strandedness: error: the following arguments are required: -r1/--reads_1, -r2/--reads_2
command
check_strandedness --gtf /xxx/Mus_musculus.GRCm38.102.gtf --transcripts /xxx/Mus_musculus.GRCm38.dna.toplevel.fa -r1 SRR1538550.fastq.gz
log
usage: check_strandedness [-h] -g GTF [-fa TRANSCRIPTS] [-n NREADS] -r1 READS_1 -r2 READS_2 [-k KALLISTO_INDEX] [-p]
check_strandedness: error: the following arguments are required: -r2/--reads_2
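For anyone wondering why the second file is demanded: the error text is the standard message that Python's argparse prints when an argument declared as required is missing. The following is only a toy sketch of that behaviour, not the tool's actual source; the build_parser helper and the file names are made up for illustration, and only the option names are taken from the usage line above.

```python
import argparse


def build_parser(r2_required: bool) -> argparse.ArgumentParser:
    """Toy parser (hypothetical) that mimics the read options in the usage line."""
    parser = argparse.ArgumentParser(prog="check_strandedness")
    parser.add_argument("-g", "--gtf", required=True)
    parser.add_argument("-r1", "--reads_1", required=True)
    # The only difference between the two behaviours is this flag.
    parser.add_argument("-r2", "--reads_2", required=r2_required)
    return parser


# Required options are printed bare; optional ones are wrapped in [ ].
old_usage = build_parser(r2_required=True).format_usage()
new_usage = build_parser(r2_required=False).format_usage()
print(old_usage)
print(new_usage)

# With -r2 optional, a single-end call parses and reads_2 is simply None.
args = build_parser(r2_required=False).parse_args(
    ["-g", "genes.gtf", "-r1", "reads.fastq.gz"]  # placeholder file names
)
print(args.reads_2)
```

So as long as the installed version declares -r2 as required, no arrangement of the other arguments will get past the parser.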
It still requires the -r2 file even for single-end data. As a workaround, would it be OK to pass the same single-end file to both the -r1 and -r2 options? Thank you.
I noticed that the check_strandedness.py code here has been updated to include an option for single-end reads; however, the conda install and the pip install available have not been updated since this change was made.
Here is my workaround for anyone else still struggling. I have pip installed how_are_we_stranded_here in a conda environment. Here are my yaml file contents (I used biopython for writing the required transcripts fasta file, but it is not necessary for running this tool):
name: checkStrandedness
channels:
  - bioconda
  - conda-forge
dependencies:
  - biopython
  - kallisto=0.44.0
  - pip
  - pip:
    - how_are_we_stranded_here==1.0.1
From here, I found the bin/ directory of my conda environment checkStrandedness. For me this was /home/users/feyr/miniconda3/envs/checkStrandedness/bin. I made a copy of the file check_strandedness. I called mine “check_strandedness_single_end”.
Next, I copied the updated code from the link above into a script in the same directory. I called mine check_strandedness_July2022.py so I know when it was updated, and to avoid clashing with the name of the installed module.
In my copy of check_strandedness (the file I named check_strandedness_single_end), I modified the following line:
from how_are_we_stranded_here.check_strandedness import main
To read:
from check_strandedness_July2022 import main
You can test that it works by running the command “check_strandedness_single_end” on the command line with no arguments (or, if you named your copy something else, use that name). If the usage statement shows [-r2 READS_2] like this, in square brackets, it is the updated version, in which the second FASTQ file is optional.
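If you would rather script that check than read the usage line by eye, here is a minimal sketch. The helper names r2_is_optional and usage_of are my own, and the "patched" usage string is a shortened, hypothetical example; only the "installed" usage line is copied from the log earlier in this thread.

```python
import subprocess


def r2_is_optional(usage_text: str) -> bool:
    """True if the usage text lists the second reads file in square brackets."""
    return "[-r2 READS_2]" in usage_text


def usage_of(command: str) -> str:
    """Run a command with no arguments and return everything it printed."""
    # The usage message is written to stderr when required arguments are
    # missing, so both streams are collected. The command must be on PATH.
    result = subprocess.run([command], capture_output=True, text=True)
    return result.stdout + result.stderr


# Usage line from the pip/conda release, as reported above.
installed = (
    "usage: check_strandedness [-h] -g GTF [-fa TRANSCRIPTS] [-n NREADS] "
    "-r1 READS_1 -r2 READS_2 [-k KALLISTO_INDEX] [-p]"
)
# Shortened, hypothetical usage line for a patched copy.
patched = "usage: check_strandedness_single_end [-h] -g GTF -r1 READS_1 [-r2 READS_2]"

print(r2_is_optional(installed))
print(r2_is_optional(patched))
```

With the conda environment activated, r2_is_optional(usage_of("check_strandedness_single_end")) should then tell you which version your wrapper is picking up.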
I hope this helps someone else wanting to use this tool for single-end data.