numbat
Error in if (UPGMA_score > NJ_score)
Error caused by providing segs_loh
(ER3 in ovarian visium dataset)
Building phylogeny ..
Mem used: 0.517Gb
Using 9 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { :
missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
2: In log(1 - P) : NaNs produced
3: In log(1 - P) : NaNs produced
Execution halted
Decided not to fix this now because the fix changes the results for NCI-N87. The fix is in the segs_loh branch.
Hi Teng,
Sorry, I'm still getting this error using the segs_loh branch.
Here's the full log below. Grateful for your work with numbat and for any advice you may have – thanks.
Found 9 regions with LOH/deletions.
# A tibble: 9 × 6
  CHROM seg   seg_start   seg_end snp_rate loh
  <fct> <fct>     <int>     <int>    <dbl> <lgl>
1 1     1b    120150898 145992442     1.31 TRUE
2 1     1d    148102046 149903320     8.98 TRUE
3 2     2b    186694060 189031898     7.66 TRUE
4 2     2d    200811910 201451740     9.81 TRUE
5 9     9b     39072767  68705240    15.4  TRUE
6 11    11b    59171430  61680391     8.67 TRUE
7 14    14b    24299850  30622254    16.9  TRUE
8 16    16b    67192155  68023284    18.2  TRUE
9 19    19b    49446298  49528003     8.49 TRUE
Running under parameters:
t = 1e-04
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 2460.5
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms =
ncores = 8
ncores_nni = 8
common_diploid = TRUE
tau = 0.5
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
4921 cells
Mem used: 5.98Gb
Approximating initial clusters using smoothed expression ..
Mem used: 5.98Gb
number of genes left: 12701
running hclust...
Iteration 1
Mem used: 13.2Gb
Running HMMs on 5 cell groups..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Expression noise level: medium (0.72).
Running HMMs on 3 cell groups..
Testing for multi-allelic CNVs ..
3 multi-allelic CNVs found: 19a,2a,9a
Evaluating CNV per cell ..
Mem used: 9.27Gb
Excluding clonal LOH regions ..
All cells succeeded
Expanding allelic states..
Building phylogeny ..
Mem used: 9.58Gb
Using 23 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { :
missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
1: In log(1 - P) : NaNs produced
2: In log(1 - P) : NaNs produced
Execution halted
Hi @anderswe ,
Thanks, I will look into this.
Best, Teng
Thanks, @teng-gao!
I'll give this a go right now.
Hi @teng-gao,
Thank you for the wonderful tool that is Numbat.
I have an issue on some datasets where this same error appears, even though I set multi_allelic=FALSE in run_numbat.
Here's the full log below and thanks again for your help.
Attaching SeuratObject
Attaching sp
Loading required package: Matrix
Numbat version: 1.3.0
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 99.9
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
segs_loh = None
call_clonal_loh = TRUE
segs_consensus_fix = None
multi_allelic = FALSE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = None
ncores = 16
ncores_nni = 16
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
333 cells
Mem used: 3.92Gb
Calling segments with clonal LOH
Approximating initial clusters using smoothed expression ..
Mem used: 3.93Gb
number of genes left: 8524
running hclust...
Iteration 1
Mem used: 4.22Gb
Expression noise level (MSE): high (2). Consider using a custom expression reference profile.
Running HMMs on 4 cell groups..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Running HMMs on 2 cell groups..
Evaluating CNV per cell ..
Mem used: 4.17Gb
Excluding clonal LOH regions ..
All cells succeeded
Building phylogeny ..
Mem used: 4.18Gb
Using 10 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { :
missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
1: In log(1 - P) : NaNs produced
2: In log(1 - P) : NaNs produced
Execution halted
srun: error: cpu-node-56: task 0: Exited with exit code 1
Hi @MartinCastagne ,
Is it possible to share your input data? Feel free to do so via email [email protected].
Hi @anderswe @MartinCastagne
This problem should be fixed in the main branch now (v1.3.1). Let me know if you still have the same issue.
Hi,
First, thank you for this package, which seems really promising for CNV calling. I am trying to use the WGS data I have for my samples to provide CNV calls via segs_consensus_fix. My table looks like this:
When using Numbat, I get this error, which translates to "Error in if (UPGMA_score > NJ_score) { : missing value where TRUE/FALSE is required", and I can't find an explanation for it:
Maybe it is because my seg column looks like "1.1/1.2/1.3/etc." instead of "1.a/1.b/1.c/etc.", but I sometimes have more than 26 segments on the same chromosome.
I was about to open a new issue, but then I found this thread, so I'm posting here. With the same data and parameters but without segs_consensus_fix, I have no issue, so the two might be linked, even though the other people here don't seem to use it. Thank you very much for your time.
PS: after testing with multi_allelic = FALSE, I get the same error.
The error usually means you have multiple segments with the same name. Have you checked if the segments are uniquely named?
I thought I had, but it turns out that an unintended type conversion had made some of them identical. I fixed it, and now everything seems to be working fine. Thank you for the suggestion!
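For anyone else landing here with a custom segment table, a quick uniqueness check on the seg column before calling run_numbat may save some time. The sketch below is not part of Numbat; it uses only the Python standard library on a small made-up table (the column names follow the tibble earlier in this thread, and the rows are hypothetical). It also illustrates one way a type conversion could produce duplicates, which is an assumption about the mechanism: labels such as "1.1" and "1.10" are distinct as text but collapse to the same value once parsed as numbers.

```python
import csv
import io
from collections import Counter


def duplicated_segments(rows):
    """Return {seg: count} for every segment label that appears more than once."""
    counts = Counter(row["seg"] for row in rows)
    return {seg: n for seg, n in counts.items() if n > 1}


# Hypothetical segment table (tab-separated), not taken from any real dataset.
table = (
    "CHROM\tseg\tseg_start\tseg_end\n"
    "1\t1.1\t100\t200\n"
    "1\t1.10\t900\t1000\n"
    "2\t2.1\t100\t200\n"
)

# Read every column as text, so labels stay exactly as written.
rows = list(csv.DictReader(io.StringIO(table), delimiter="\t"))
as_text = duplicated_segments(rows)

# Simulate an unintended numeric conversion of the label column:
# "1.10" parses to the same number as "1.1".
coerced = [dict(row, seg=str(float(row["seg"]))) for row in rows]
as_number = duplicated_segments(coerced)

print("duplicates when read as text:  ", as_text)
print("duplicates after numeric parse:", as_number)
```

If the second dictionary is non-empty for your own table, reading the seg column explicitly as character (or using labels that cannot be parsed as numbers) should avoid the collision.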
Hello!
I am receiving this same error:
Should each seg be completely unique instead of having 1718 counts per segment?
But it seems like my columns of P are unique?
Thank you @teng-gao
@whitneyt1 Hmm, something weird is happening. It seems the P matrix wasn't constructed correctly. Would you be able to send me your source data? Feel free to use my email [email protected]