CENTIPEDE.tutorial
CENTIPEDE.tutorial copied to clipboard
Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed
Hi, When I run the centipede_data , the result show : Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed. Maybe it is some problem with my bam file?
Thanks Zhen
Hi Zhen,
Thanks for opening an issue! If you could share a minimal example of code and subset of your data that reproduces this error, I can try to run it on my laptop and fix the problem.
Please feel free to reopen this issue if you can provide a minimal example.
Hey slowkow!
I get the same error as Zhen. Here is the snippet of code and the corresponding error:
>cen_P0_rep1 <- centipede_data(bam_file = "call-bowtie2/shard-0/execution/P0_rep1_R1.merged.bam", fimo_file = "centipede-files/P0_rep1.fimo.txt.gz", pval= 1e-4, flank_size = 100)
Parsed with column specification:
cols(
motif_id = col_character(),
motif_alt_id = col_character(),
sequence_name = col_character(),
start = col_double(),
stop = col_double(),
strand = col_character(),
score = col_double(),
`p-value` = col_double(),
`q-value` = col_logical(),
matched_sequence = col_character()
)
|=========================================================================| 100% 824 MB
Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed
The code ran for couple of hours before throwing this error. So seems like something happened near the end of execution. I get the same error even if I remove the parameters "flank size" and/or "pval".
Let me know if I can provide any more information or if you want me to open a new issue.
Thanks!
Hi Vatsal, thanks for sharing your code and error. That helps.
If your session is still live, it would help if you run traceback()
so we can see which line in which file caused the error.
The error might come from this code (line 163):
https://github.com/slowkow/CENTIPEDE.tutorial/blob/12811e78f035510c5d021bf5f4295be3fd935ffa/R/functions.R#L151-L165
If you can share data and code that I can run on laptop, then I might be able to help.
Otherwise, I would suggest that you try running the code from functions.R
by hand -- line by line in the R console -- and try to find out what is going wrong.
Good luck! Let me know if you figure it out!
Thanks for responding so quickly Kamil!
I did run the traceback command. Here's the output:
> traceback()
4: FUN(X[[i]], ...)
3: lapply(seq(1, length(bam_list)), function(i) {
region <- parse_region(names(bam_list)[i])
region_len <- abs(region$end - region$start) + 1
item <- bam_list[[i]]
neg_shift <- item$qwidth * as.numeric(item$strand == "-")
item$pos <- item$pos + neg_shift
idx <- item$pos >= region$start & item$pos <= region$end
if (sum(idx) == 0) {
return(rep(0, 2 * region_len))
}
strand <- item$strand[idx]
position <- item$pos[idx]
is.neg <- as.numeric(strand == "-")
j <- 1 + position - region$start + (region_len * is.neg)
as.numeric(table(factor(j, levels = seq(1, 2 * region_len))))
})
2: do.call(rbind, lapply(seq(1, length(bam_list)), function(i) {
region <- parse_region(names(bam_list)[i])
region_len <- abs(region$end - region$start) + 1
item <- bam_list[[i]]
neg_shift <- item$qwidth * as.numeric(item$strand == "-")
item$pos <- item$pos + neg_shift
idx <- item$pos >= region$start & item$pos <= region$end
if (sum(idx) == 0) {
return(rep(0, 2 * region_len))
}
strand <- item$strand[idx]
position <- item$pos[idx]
is.neg <- as.numeric(strand == "-")
j <- 1 + position - region$start + (region_len * is.neg)
as.numeric(table(factor(j, levels = seq(1, 2 * region_len))))
}))
1: centipede_data(bam_file = "call-bowtie2/shard-0/execution/P0_rep1_R1.merged.bam",
fimo_file = "centipede-files/P0_rep1.fimo.txt.gz", pval = 1e-04,
flank_size = 100)
I doubt this is very helpful for you. Since the input files are massive, would you like a trimmed down version of those files? I worry that these smaller files may disallow you to duplicate the exact error.
Otherwise, I would suggest that you try running the code from functions.R by hand -- line by line in the R console -- and try to find out what is going wrong.
I still haven't tried this so will let you know if I find something interesting.
So it turns out there was an issue with the BAM file. When I used the "deduped" BAM file, the problem seems to disappear. May be it is important for CENTIPEDE that the BAM file is without duplicates? I am not sure but at the moment that has fixed the issue.
Hi all,
I have the same problem. I tried removing duplicates but my data is a paired-end I have not been able to succeed in doing so. Has anyone looked into this issue? How did you remove duplicates?
Thank you.
Shoko