FRASER
FRASER copied to clipboard
Error: checkIntergenic on unlocalized and unplaced sequences
When using comprehensive gene annotations, unlocalized and unplaced sequences result in an error when attempting to annotate potential impact of splice junctions with annotatePotentialImpact
.
Debugging
- The error arises since no genes in the
txdb
are located on unlocalizedchr*_random
and unplacedchrUn_*
sequences. - This causes
min(distance(test_junction, refseq.genes), na.rm = TRUE)
to evaluate toInf
. https://github.com/c-mertes/FRASER/blob/e775832aef77c14aa3c414832735bd1b0b9a4e47/R/resultAnnotations.R#L817 -
distance(test_junction, refseq.genes)
in turn evaluates to a vector of exclusively NA's. -
min(c(NA, NA, ...), na.rm=TRUE)
is equivalent tomin(numeric(0))
, which evaluates toInf
. - As
Inf
is greater than 0, the conditionaldist > 0
evaluates toTRUE
, whereas the code block under the if-statement is erroneously executed. https://github.com/c-mertes/FRASER/blob/e775832aef77c14aa3c414832735bd1b0b9a4e47/R/resultAnnotations.R#L818-L830
Fix
This error can be resolved by modifying the condition of the if-statement from dist > 0
to is.finite(dist) && dist > 0
.
https://github.com/c-mertes/FRASER/blob/e775832aef77c14aa3c414832735bd1b0b9a4e47/R/resultAnnotations.R#L818
Error
1. annotatePotentialImpact(result = res_junc_dt, txdb = txdb, fds = fds)
6. addPotentialImpactLabels(annoResult, fds, txdb)
8. sapply(noLaps, function(i) {
. overlap <- to(findOverlaps(junctions_gr[i], exons))
. if (length(overlap) == 0) {
. return(checkIntergenic(junctions_gr, i, refseq.genes))
. }
. for (j in overlap) {
. exon_start = start(exons[j])
. exon_end = end(exons[j])
. if (exon_start <= start(junctions_gr[i]) & exon_end >=
. end(junctions_gr[i])) {
. if ((end(junctions_gr[i]) - start(junctions_gr[i]) +
. 1)%%3 != 0) {
. frs = "likely"
. }
. else {
. frs = "unlikely"
. }
. return(c("exonTruncation", frs))
. }
. }
. return(c("complex", "inconclusive"))
. })
9. lapply(X = X, FUN = FUN, ...)
10. FUN(X[[i]], ...)
11. checkIntergenic(junctions_gr, i, refseq.genes)
12. start(refseq.genes[nearest(junctions_gr[i], refseq.genes)])
13. refseq.genes[nearest(junctions_gr[i], refseq.genes)]
14. refseq.genes[nearest(junctions_gr[i], refseq.genes)]
15. subset_along_ROWS(x, i, drop = drop)
16. extractROWS(x, i)
17. extractROWS(x, i)
18. normalizeSingleBracketSubscript(i, x, as.NSBS = TRUE)
19. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append,
. allow.NAs = allow.NAs)
20. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append,
. allow.NAs = allow.NAs)
21. .subscript_error("subscript contains NAs")
22. stop(wmsg(...), call. = FALSE)
23. .handleSimpleError(function (cond)
. .Internal(C_tryCatchHelper(addr, 1L, cond)), "subscript contains NAs",
. base::quote(NULL))
24. h(simpleError(msg, call))
Hi @jakobjn ,
thanks for reporting this! I will look into this and come back to you.
Can you add which FRASER / R / Bionconductor version you are using and provide an example of a txdb
object that results in the error?
@ischeller I'm using FRASER 1.99.3 (1b0cd9581ba1b1ae2bf329821c3a5823ef6505b7) as a Conda package built from this custom recipe. This is to exclusively utilize Conda for package management.
Thanks for the report on this. I merged it now. Please let us know if you still encounter issues.