This software cannot be used for single-end RNA-seq data

Open jingydz opened this issue 2 years ago • 2 comments

command

check_strandedness --gtf /xxx/Mus_musculus.GRCm38.102.gtf --transcripts /xxx/Mus_musculus.GRCm38.dna.toplevel.fa SRR1538550.fastq.gz

log

usage: check_strandedness [-h] -g GTF [-fa TRANSCRIPTS] [-n NREADS] -r1 READS_1 -r2 READS_2 [-k KALLISTO_INDEX] [-p]
check_strandedness: error: the following arguments are required: -r1/--reads_1, -r2/--reads_2

command

check_strandedness --gtf /xxx/Mus_musculus.GRCm38.102.gtf --transcripts /xxx/Mus_musculus.GRCm38.dna.toplevel.fa -r1 SRR1538550.fastq.gz

log

usage: check_strandedness [-h] -g GTF [-fa TRANSCRIPTS] [-n NREADS] -r1 READS_1 -r2 READS_2 [-k KALLISTO_INDEX] [-p]
check_strandedness: error: the following arguments are required: -r2/--reads_2

jingydz May 08 '22 16:05

It does still require the -r2 file, even for single-end data. As a workaround, would it be OK to pass the same single-end file to both the -r1 and -r2 options? Thank you.

rdistefano Nov 09 '22 16:11

I noticed that the check_strandedness.py code here has been updated to include an option for single-end reads; however, the conda install and the pip install available have not been updated since this change was made.

Here is my workaround for anyone else still struggling. I installed how_are_we_stranded_here with pip inside a conda environment. Here are the contents of my YAML file (I used biopython to write the required transcripts FASTA file, but it is not needed to run this tool):

name: checkStrandedness
channels:
  - bioconda
  - conda-forge
dependencies:
  - biopython
  - kallisto=0.44.0
  - pip
  - pip:
    - how_are_we_stranded_here==1.0.1

From here, I found the bin/ directory of my conda environment checkStrandedness. For me this was /home/users/feyr/miniconda3/envs/checkStrandedness/bin. I made a copy of the file check_strandedness. I called mine “check_strandedness_single_end”.

Next, I copied the updated code from the link above into a script in the same directory. I called mine check_strandedness_July2022.py so I know when it was updated, and to avoid double-naming errors.

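In my copy of check_strandedness (the one I named check_strandedness_single_end), I modified the following line:

from how_are_we_stranded_here.check_strandedness import main

to read:

from check_strandedness_July2022 import main

The module name in this import has to match the filename of the updated script, minus the .py extension.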
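For anyone who would rather script the copy-and-edit steps than do them by hand, here is a minimal Python sketch. It assumes the installed wrapper contains exactly the import line quoted above, and that the updated script has already been saved in the same bin/ directory under the module name used in the new import. The function names are my own and are not part of how_are_we_stranded_here.

```python
import stat
from pathlib import Path

# Import line found in the installed wrapper (quoted in the comment above).
OLD_IMPORT = "from how_are_we_stranded_here.check_strandedness import main"
# Replacement import; the module name must match the filename of the updated
# script (without .py) saved in the same bin/ directory.
NEW_IMPORT = "from check_strandedness_July2022 import main"


def patch_wrapper_text(text: str, old: str = OLD_IMPORT, new: str = NEW_IMPORT) -> str:
    """Return the wrapper source with the packaged import swapped for the local module."""
    if old not in text:
        # Assumption violated: this install's wrapper looks different.
        raise ValueError("expected import line not found in wrapper")
    return text.replace(old, new)


def make_single_end_wrapper(
    wrapper: Path, new_name: str = "check_strandedness_single_end"
) -> Path:
    """Write a patched copy of the wrapper next to the original and return its path."""
    target = wrapper.with_name(new_name)
    target.write_text(patch_wrapper_text(wrapper.read_text()))
    # Carry over the permission bits so the copy stays executable.
    target.chmod(stat.S_IMODE(wrapper.stat().st_mode))
    return target
```

With the conda environment activated, passing the path of the installed check_strandedness file (for example, the result of shutil.which("check_strandedness") wrapped in Path) to make_single_end_wrapper should leave the original wrapper untouched and create the patched copy beside it. The function raises an error instead of writing anything if the expected import line is missing.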

You can test that it works by running “check_strandedness_single_end” on the command line with no arguments (or, if you named your copy something else, use that name). If the usage statement shows [-r2 READS_2] in square brackets, you are running the updated version, in which the second FASTQ file is optional.
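The same check can be done programmatically. This is only a sketch: the command name is whatever you called your copy, and the helper names are hypothetical. It relies on argparse's convention of wrapping optional arguments in square brackets in the usage line.

```python
import re
import subprocess


def usage_text(command: str = "check_strandedness_single_end") -> str:
    """Run the command with no arguments and return whatever it printed."""
    # argparse writes the usage and error message to stderr and exits non-zero,
    # so both streams are captured and the exit code is ignored.
    result = subprocess.run([command], capture_output=True, text=True)
    return result.stdout + result.stderr


def r2_is_optional(usage: str) -> bool:
    """True if the usage text lists -r2 inside square brackets, i.e. as optional."""
    return re.search(r"\[-r2\s+READS_2\]", usage) is not None
```

Calling r2_is_optional(usage_text()) inside the activated environment should return True for the updated script, and False for the released version whose usage line is shown in the logs at the top of this issue.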

I hope this helps someone else wanting to use this tool for single-end data.

rfey Mar 10 '23 16:03