Mandalorion-old
Interpreting output BED files
Hi there. I've looked through the Mandalorion documentation here, but I still have a few clarifications regarding interpretation of some of the BED output files.
The following snippet is from the TESS.bed output file. Does Sl correspond to a TSS on the plus strand (l = left?), El = TES on the minus strand, Sr = TSS on minus strand, and Er = TES on plus strand? Is a similar scheme used for the SS.bed file (e.g. 5l = 5' splice site on plus strand, etc.)? What does the _A indicate at the end of the item in column 4? The range of these items is 20nt, so is the actual TSS/TES the midpoint of this range (e.g. in the first line, the TSS is at position 69660)?
chrII 69650 69670 Sl23_69650_69670_A 23
chrV 2164816 2164836 El35593_2164816_2164836_A 35593
chrV 9137708 9137728 Sr33582_9137708_9137728_A 33582
chrV 8349177 8349197 Er37846_8349177_8349197_A 37846
Also, is there any difference between SS.bed and SS_raw.bed? These files are identical for me.
TIA for any insights.
Hi Mallory,
Apologies for the late reply. I finally managed to take a vacation. The naming of features follows the scheme you already more or less deciphered.
Sl -> transcription Start site identified using the Left end of reads. This would indicate a plus strand transcript.
5r -> 5' splice site called using the right end of an alignment gap. This would indicate a minus strand transcript.
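For anyone who wants to decode column 4 programmatically, a minimal Python sketch of this scheme might look like the following. It is not part of Mandalorion. The field layout is inferred from the four example lines in this thread. Only Sl (plus) and 5r (minus) are confirmed above; the strand for the other prefixes is extrapolated by symmetry and should be treated as an assumption. The function name `parse_feature_name` and the returned keys are hypothetical, and the meaning of the number after the prefix and of the trailing suffix (e.g. `_A`) is not stated here, so both are returned as-is.

```python
import re

# Layout assumed from the examples in this thread, e.g. "Sl23_69650_69670_A":
# <type><side><number>_<start>_<end>_<suffix>
NAME_RE = re.compile(r"^([SE53])([lr])(\d+)_(\d+)_(\d+)_(\w+)$")


def parse_feature_name(name):
    """Split a column-4 feature name into its parts and infer the strand."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised feature name: {name!r}")
    kind, side, number, start, end, suffix = match.groups()

    # Confirmed in this thread: Sl -> plus, 5r -> minus.
    # Assumed by symmetry: features that begin a transcript or intron (S, 5)
    # sit on the left for plus-strand transcripts, while features that end
    # one (E, 3) sit on the right.
    begins_unit = kind in ("S", "5")
    strand = "+" if begins_unit == (side == "l") else "-"

    return {
        "kind": kind,          # S, E, 5 or 3
        "side": side,          # l (left) or r (right)
        "number": int(number), # meaning not stated in this thread
        "start": int(start),
        "end": int(end),
        "suffix": suffix,      # e.g. "A"; meaning not stated in this thread
        "strand": strand,
    }


if __name__ == "__main__":
    for example in ("Sl23_69650_69670_A", "Er37846_8349177_8349197_A"):
        print(example, parse_feature_name(example))
```

The strand rule is kept as a single comparison so that it is easy to change if the extrapolated cases turn out to be wrong for your data.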
Finally, there is a difference between SS_raw.bed and SS.bed if you used a genome annotation. SS.bed contains all splice sites - annotated and inferred.
Best, Chris
Thanks Chris.
I hope you enjoyed your vacation!