Whippet.jl

Option for reverse strand-specific data in bin/whippet-index.jl

Open areyesq89 opened this issue 3 years ago • 0 comments

Hi,

Thank you for developing whippet. Very useful tool!

I have a paired-end, reverse strand-specific dataset, and I tried to use alignments to the reference genome (BAM files) to complement the junctions from the GTF file. When I ran the index step, I got a message saying that 0 junctions were found in the BAM file. Digging into the code, I noticed that the line below in src/bam.jl was filtering out the splice junctions found in my BAM files:

     # if is spliced process splice sites
     if isspliced(rec) && strand == strandpos(rec)
        known = process_spliced_record!( novelacc, noveldon, rec, known, oneknown )

As a test, I commented out the second condition of the if statement, and I could then recover my spliced reads and complement the index with them.

     # if is spliced process splice sites
     if isspliced(rec) ## && strand == strandpos(rec)
        known = process_spliced_record!( novelacc, noveldon, rec, known, oneknown )

With reverse strand-specific data, that condition would throw away all junction data. Perhaps it would be useful to add a "strand-specificity" parameter to bin/whippet-index.jl?
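To make the suggestion concrete, here is a minimal sketch of how such a parameter could drive the strand check. It is only an illustration: `LibraryType`, its three values, and `strand_compatible` are hypothetical names that do not exist in Whippet, and I am assuming that both strands are represented as `Bool` values with `true` meaning the plus strand. It also ignores mate orientation in paired-end data, which a real implementation would need to handle.

```julia
# Hypothetical library-type setting; none of these names exist in Whippet.
@enum LibraryType UNSTRANDED FORWARD_STRANDED REVERSE_STRANDED

# Decide whether a spliced read may contribute junctions to a gene.
# Assumption: `gene_strand` and `read_strand` are `true` for the plus strand.
function strand_compatible(gene_strand::Bool, read_strand::Bool, lib::LibraryType)
    if lib == UNSTRANDED
        return true                        # strand carries no information
    elseif lib == FORWARD_STRANDED
        return gene_strand == read_strand  # same test as the current condition
    else
        return gene_strand != read_strand  # reverse-stranded: read maps opposite to the gene
    end
end
```

With something like this, the condition in src/bam.jl could become `isspliced(rec) && strand_compatible(strand, strandpos(rec), lib)`, where `lib` would come from a new command-line option. Forward-stranded mode reproduces the current behaviour, so existing results would not change if it were the default.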

Best regards, Alejandro Reyes

areyesq89, Jan 18 '21 13:01