
Error in if (UPGMA_score > NJ_score)

Open teng-gao opened this issue 2 years ago • 12 comments

The error is triggered by providing segs_loh (ER3 in the ovarian Visium dataset):

Building phylogeny ..
Mem used: 0.517Gb
Using 9 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { : 
  missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
2: In log(1 - P) : NaNs produced
3: In log(1 - P) : NaNs produced
Execution halted
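
For readers unfamiliar with this message: it comes from base R's `if`, which refuses an `NA` condition. Below is a minimal sketch of how the `NaNs produced` warnings and the error can be connected. It assumes, without confirmation from numbat's source, that a tree score is a sum of `log(1 - P)` terms; `P`, `score`, and the threshold `-10` are made-up illustrations.

```r
# Hypothetical probabilities; one value sits slightly above 1 (e.g. from rounding)
P <- c(0.2, 0.9, 1 + 1e-8)

# log() of a negative number yields NaN, normally with the warning "NaNs produced"
terms <- suppressWarnings(log(1 - P))
score <- sum(terms)  # NaN propagates through sum()

# Comparing NaN with a number gives NA, and if (NA) stops with an error
res <- tryCatch(
  if (score > -10) "first" else "second",
  error = function(e) conditionMessage(e)
)
print(res)
```

The point of the sketch is only that a single out-of-range value is enough to turn a score comparison into `NA`; it says nothing about where such a value originates in a real run.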

teng-gao avatar Nov 28 '22 01:11 teng-gao

Decided not to fix this now because the fix changes the results for NCI-N87. The fix is in the segs_loh branch.

teng-gao avatar Nov 28 '22 04:11 teng-gao

Hi Teng,

Sorry, still getting this error using the segs_loh branch.

Here's the full log below. Grateful for your work with numbat and for any advice you may have – thanks.

Found 9 regions with LOH/deletions.
# A tibble: 9 × 6
  CHROM seg   seg_start   seg_end snp_rate loh  
  <fct> <fct>     <int>     <int>    <dbl> <lgl>
1 1     1b    120150898 145992442     1.31 TRUE 
2 1     1d    148102046 149903320     8.98 TRUE 
3 2     2b    186694060 189031898     7.66 TRUE 
4 2     2d    200811910 201451740     9.81 TRUE 
5 9     9b     39072767  68705240    15.4  TRUE 
6 11    11b    59171430  61680391     8.67 TRUE 
7 14    14b    24299850  30622254    16.9  TRUE 
8 16    16b    67192155  68023284    18.2  TRUE 
9 19    19b    49446298  49528003     8.49 TRUE 
Running under parameters:
t = 1e-04
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 2460.5
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = 
ncores = 8
ncores_nni = 8
common_diploid = TRUE
tau = 0.5
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
4921 cells
Mem used: 5.98Gb
Approximating initial clusters using smoothed expression ..
Mem used: 5.98Gb
number of genes left: 12701
running hclust...
Iteration 1
Mem used: 13.2Gb
Running HMMs on 5 cell groups..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Expression noise level: medium (0.72). 
Running HMMs on 3 cell groups..
Testing for multi-allelic CNVs ..
3 multi-allelic CNVs found: 19a,2a,9a
Evaluating CNV per cell ..
Mem used: 9.27Gb
Excluding clonal LOH regions .. 
All cells succeeded
Expanding allelic states..
Building phylogeny ..
Mem used: 9.58Gb
Using 23 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { : 
  missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
1: In log(1 - P) : NaNs produced
2: In log(1 - P) : NaNs produced
Execution halted

anderswe avatar Feb 27 '23 21:02 anderswe

Hi @anderswe ,

Thanks, I will look into this.

Best, Teng

teng-gao avatar Mar 06 '23 18:03 teng-gao

Thanks, @teng-gao!

I'll give this a go right now.

anderswe avatar Mar 06 '23 18:03 anderswe

Hi @teng-gao, thank you for the wonderful tool that is Numbat. On some datasets I get this same error, even though I set multi_allelic=FALSE in run_numbat. Here's the full log below, and thanks again for your help.

Attaching SeuratObject
Attaching sp
Loading required package: Matrix
Numbat version: 1.3.0
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 99.9
n_cut = 0
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
segs_loh = None
call_clonal_loh = TRUE
segs_consensus_fix = None
multi_allelic = FALSE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms = None
ncores = 16
ncores_nni = 16
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
333 cells
Mem used: 3.92Gb
Calling segments with clonal LOH
Approximating initial clusters using smoothed expression ..
Mem used: 3.93Gb
number of genes left: 8524
running hclust...
Iteration 1
Mem used: 4.22Gb
Expression noise level (MSE): high (2). Consider using a custom expression reference profile.
Running HMMs on 4 cell groups..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Retesting CNVs..
Running HMMs on 2 cell groups..
Evaluating CNV per cell ..
Mem used: 4.17Gb
Excluding clonal LOH regions ..
All cells succeeded
Building phylogeny ..
Mem used: 4.18Gb
Using 10 CNVs to construct phylogeny
Aggregate function missing, defaulting to 'length'
Error in if (UPGMA_score > NJ_score) { : 
  missing value where TRUE/FALSE needed
Calls: run_numbat
In addition: Warning messages:
1: In log(1 - P) : NaNs produced
2: In log(1 - P) : NaNs produced
Execution halted
srun: error: cpu-node-56: task 0: Exited with exit code 1

MartinCastagne avatar Apr 13 '23 11:04 MartinCastagne

Hi @MartinCastagne ,

Is it possible to share your input data? Feel free to do so via email [email protected].

teng-gao avatar Apr 13 '23 14:04 teng-gao

Hi @anderswe @MartinCastagne

This problem should be fixed in the main branch now (v1.3.1). Let me know if you still have the same issue.
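
A quick way to confirm which version is in use before re-running, as a small sketch in base R. The helper name `has_fix` is hypothetical; the threshold 1.3.1 is the release named above.

```r
# TRUE if a version string is at least 1.3.1
has_fix <- function(version) package_version(version) >= "1.3.1"

print(has_fix("1.3.0"))  # the version reported in the log earlier in this thread
print(has_fix("1.3.1"))

# Check the installed copy, but only if numbat is actually installed
if (nzchar(system.file(package = "numbat"))) {
  print(has_fix(as.character(utils::packageVersion("numbat"))))
}
```

Comparing through `package_version` rather than as plain strings matters once a component reaches two digits, since "1.10.0" would otherwise sort before "1.3.1".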

teng-gao avatar Apr 15 '23 16:04 teng-gao


First, thank you for this package, which seems really promising for CNV calling. I am trying to use WGS data from my samples to provide CNV calls via segs_consensus_fix. My table looks like this: [image: screenshot, 2023-06-14 13:56:04]

When running Numbat, I get an error that translates to "Error in if (UPGMA_score > NJ_score) { : missing value where TRUE/FALSE is required", and I can't find an explanation for it (full error log).

Maybe it is because my seg column looks like "1.1/1.2/1.3/etc." instead of "1.a/1.b/1.c/etc.", but I sometimes have more than 26 segments on the same chromosome. I was about to open a new issue, then found this thread, so I'm posting here. With the same data and parameters but without segs_consensus_fix, I have no issue, so the two might be linked, even though the other people here don't seem to use it. Thank you very much for your time.

PS: Testing with multi_allelic = FALSE gives the same error.

BardeChoco225 avatar Jun 14 '23 12:06 BardeChoco225

The error usually means you have multiple segments with the same name. Have you checked if the segments are uniquely named?
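
One way to run that check, as a minimal base R sketch. The table `segs` below is invented for illustration; only the column names follow the segment table printed earlier in this thread.

```r
# Hypothetical segment table; "2a" appears twice on purpose
segs <- data.frame(
  CHROM = c(1, 1, 2, 2),
  seg = c("1a", "1b", "2a", "2a"),
  seg_start = c(1, 5000, 1, 7000),
  seg_end = c(4999, 9000, 6999, 12000),
  stringsAsFactors = FALSE
)

# Names that occur more than once
dup_names <- unique(segs$seg[duplicated(segs$seg)])
print(dup_names)

# Rows involved, for inspection
print(segs[segs$seg %in% dup_names, ])
```

An empty `dup_names` means every segment name is unique. It is worth running the check on the table as it looks after loading into R, not on the file as written.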

teng-gao avatar Jun 14 '23 22:06 teng-gao

I thought I had, but it turns out that an unintended type conversion made some of them identical. I fixed it, and now everything seems to be working fine. Thank you for the suggestion!
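
For anyone hitting the same thing, here is a sketch of how a type conversion can collapse names in the "1.1, 1.2, ..." style. The example names are invented, and whether this matches the exact conversion in the case above is an assumption. The underscore naming at the end is likewise only one possible workaround; that numbat accepts it has not been verified here.

```r
# Segment names kept as text
seg_chr <- c("1.1", "1.2", "1.10", "1.20")

# If they are parsed as numbers at any point, trailing zeros are lost
seg_num <- as.character(as.numeric(seg_chr))
print(seg_num)
print(anyDuplicated(seg_num) > 0)

# Possible workaround: rebuild names from the text parts with a
# non-numeric separator so they can no longer be read as numbers
parts <- strsplit(seg_chr, ".", fixed = TRUE)
seg_safe <- vapply(parts, function(p) paste0(p[1], "_", p[2]), character(1))
print(seg_safe)
```

When reading the table from a file, passing `colClasses = "character"` for the seg column to `read.table` or `read.csv` should keep the names as text in the first place.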

BardeChoco225 avatar Jun 16 '23 10:06 BardeChoco225


I am receiving this same error: [image]

Should each seg be completely unique instead of having 1718 counts per segment? [image]

But it seems like the columns of my P matrix are unique? [image]

Thank you @teng-gao

whitneyt1 avatar Apr 10 '24 22:04 whitneyt1

@whitneyt1 Hmm, something weird is happening. It seems the P matrix wasn't constructed correctly. Would you be able to send me your source data? Feel free to use my email [email protected]

teng-gao avatar Apr 11 '24 14:04 teng-gao