CRLF2 missed fusions
From the hg38 support pull request:
Based on my understanding, this is specific to CRLF2 fusions derived from short deletions (e.g. P2RY8-CRLF2 in the paper). We can only detect this kind of fusion through a "novel" junction in the RNApeg output. However, RNApeg sometimes does not mark the junction as novel, which is why we need "next unless($line =~ m/novel/ || $line =~ m/chrX:1212/);". The "$line =~ m/chrX:1212/" clause means the novel criterion does not apply to CRLF2 junctions, because CICERO wants to detect CRLF2 fusions sensitively.
Originally posted by @liqingti in https://github.com/stjude/CICERO/pull/24
Relevant RNApeg results for SJBALL020013_D1 (example from the paper):
chrX:1325492:+,chrX:1331449:+ 1 known CRLF2,P2RY8andCRLF2,uc004cpl.1 NM_001012288,P2RY8andCRLF2.1.eAug10,uc004cpl.1 0 0 1 1 0
chrX:1327798:+,chrX:1331449:+ 105 known ENSG00000205755,P2RY8andCRLF2,uc004cpm.1 ENST00000467626,P2RY8andCRLF2.1.dAug10,uc004cpm.1 74 44 61 90 12
chrX:1327801:+,chrX:1331449:+ 200 known ENSG00000205755,ENSG00000205755,ENSG00000205755,CRLF2,P2RY8andCRLF2,P2RY8andCRLF2,uc004cpk.1 ENST00000381566,ENST00000381567,ENST00000400841,NM_022148,P2RY8andCRLF2.1.bAug10,P2RY8andCRLF2.1.cAug10,uc004cpk.1 139 73 127 178 10
chrX:1328991:+,chrX:1331449:+ 3 novel 3 1 2 3 0
chrX:1331529:+,chrX:1655814:+ 23 known P2RY8andCRLF2 P2RY8andCRLF2.1.cAug10 19 11 12 19 3
chrX:1331564:+,chrX:1655814:+ 3 novel 2 2 1 2 1
Relevant fusion event:
P2RY8 chrX 1655814 - 5utr CRLF2 chrX 1331529 - coding > 28 28 255 211
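To make the effect of the quoted filter concrete, here is a minimal sketch that applies it to the RNApeg rows listed above. It is a Python paraphrase of the Perl line from the pull request comment, not CICERO's actual code. The names `ROWS`, `passes_filter`, and `kept_junctions` are hypothetical, and only the first three fields of each row (junction, read count, type) are reproduced.

```python
import re

# First three fields (junction, read count, type) of the RNApeg rows
# quoted in this issue; the gene, transcript, and QC columns are omitted.
ROWS = [
    ("chrX:1325492:+,chrX:1331449:+", 1, "known"),
    ("chrX:1327798:+,chrX:1331449:+", 105, "known"),
    ("chrX:1327801:+,chrX:1331449:+", 200, "known"),
    ("chrX:1328991:+,chrX:1331449:+", 3, "novel"),
    ("chrX:1331529:+,chrX:1655814:+", 23, "known"),
    ("chrX:1331564:+,chrX:1655814:+", 3, "novel"),
]


def passes_filter(line: str) -> bool:
    """Python paraphrase (an assumption, not CICERO source) of:
    next unless($line =~ m/novel/ || $line =~ m/chrX:1212/);
    """
    return bool(re.search(r"novel", line) or re.search(r"chrX:1212", line))


def kept_junctions(rows):
    """Return the junctions whose reconstructed line survives the filter."""
    kept = []
    for junction, count, kind in rows:
        line = f"{junction}\t{count}\t{kind}"
        if passes_filter(line):
            kept.append(junction)
    return kept


if __name__ == "__main__":
    for junction in kept_junctions(ROWS):
        print(junction)
```

Under these assumptions only the two rows marked novel survive. The known junction chrX:1331529:+,chrX:1655814:+ (23 reads), whose coordinates match the fusion event above, is dropped because it is not marked novel and none of the coordinates in this sample contain `chrX:1212`.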
Other fusions that may be relevant: TAL1-STIL, and internal deletions in PAX5 and CTNNB1 (exon 3).