Error "replacement has 0 rows", caused by dipLogR=numeric(0)

Open ypradat opened this issue 2 years ago • 2 comments

Dear developer,

I am processing a large number of WES samples with facets and have observed the same error in a very small number of them, namely the following R error:

Error in `$<-.data.frame`(`*tmp*`, "ocn", value = numeric(0)) :
  replacement has 0 rows, data has XX

where XX varies from sample to sample. Running the code chunk by chunk, I found that the error is raised at the line

out$ocn <- 2^(1 + out$cnlr.median - dipLogR)

of the internal function fitcnf0. For the problematic samples, dipLogR has the value numeric(0), which propagates through the arithmetic and yields a zero-length result. I traced the code back to where dipLogR is computed, i.e., in the function findDiploidLogR.
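The failure mode can be reproduced in isolation. The following is a minimal sketch, assuming a toy data frame with made-up cnlr.median values (not the real facets output): arithmetic with a zero-length vector silently returns a zero-length vector, and the error only appears when that result is assigned as a data frame column.

```r
# Toy stand-in for the segment summary; the values are illustrative only.
out <- data.frame(cnlr.median = c(-0.5, 0.0, 1.1))

# Stand-in for the empty value observed in the problematic samples.
dipLogR <- numeric(0)

# Arithmetic with a zero-length vector returns a zero-length vector,
# without any warning.
ocn <- 2^(1 + out$cnlr.median - dipLogR)
length(ocn)

# Assigning it as a column fails: 0 values cannot fill 3 rows.
res <- tryCatch(
  {
    out$ocn <- ocn
    "ok"
  },
  error = function(e) conditionMessage(e)
)
res
```

Because the arithmetic itself does not fail, the reported error surfaces some distance away from the place where dipLogR became empty.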

I include here the value of out0 before proceeding.

> out0
   segclust num.mark nhet  cnlr.median       mafR
1         1       12    2 -3.011918512         NA
2         2     1356  131 -1.798816173 0.87040920
3         3     1113  129 -1.672719808 0.50768199
4         4     2591  169 -1.540583510 0.38480972
5         5       95    0 -1.449920926         NA
6         6     3549  260 -1.375430640 1.07821394
7         7     8371  617 -1.214081713 1.62760232
8         8    23391 1735 -1.153622814 1.75910412
9         9     3813  338 -1.065265524 0.38562833
10       10     9147  507 -0.986037822 1.68275186
11       11     2276  188 -0.986037822 1.00952637
12       12      590   27 -0.872059667 0.39128127
13       13     9273  668 -0.646692979 0.39995403
14       14   137519 9330 -0.497578555 0.35753481
15       15     6959  533 -0.497578555 2.91293673
16       16      111    2 -0.497578555         NA
17       17    48236 2667 -0.430391724 0.36672818
18       18    87418 5958 -0.378755987 0.35694070
19       19     1181  139 -0.305467079 0.37138545
20       20    13660   53 -0.305467079         NA
21       21    17858 1597 -0.203132374 0.36138407
22       22       60    3 -0.112683016         NA
23       23    28336 1927 -0.045396996 0.55155616
24       24    17580 1304  0.001305458 0.53605083
25       25      349   59  1.106909438 0.04739713
26       26        9    1  1.745496039         NA

Because none of the segments above has a value of mafR below 0.025, the threshold is raised to 0.05. Only row 25 of out0 satisfies this condition, and therefore bsegs takes the value 25. Also, ocnlevels takes the value

> ocnlevels
 [1] -3.011918512 -1.798816173 -1.672719808 -1.540583510 -1.449920926 -1.375430640 -1.214081713 -1.153622814 -1.065265524 -0.986037822
[11] -0.872059667 -0.646692979 -0.497578555 -0.430391724 -0.378755987 -0.305467079 -0.203132374 -0.112683016 -0.045396996  0.001305458
[21]  1.106909438  1.745496039

after running

levels <- unique(out0$cnlr.median)

Continuing through the function, dipLogR takes the intermediate value 1.106909438 after running

} else {
    # make sure bsegs is not empty
    if (length(bsegs) == 0) {
        # if no balanced segs set dipLogR at the median of cnlr
        dipLogR <- median(cnlr)
        nbal <- 0
    } else {
        dipLogR <- cnlr.median[bsegs]
        nbal <- num.mark[bsegs]
    }
}

Advancing through the code, not1plus1 takes the value FALSE, so we arrive at the point where cn2logR is computed:

# find deviance for each ocnlevel                                                       
# ocn levels cannot be any lower than lr4-1                                       
ocnlevels0 <- ocnlevels[ocnlevels > dipLogR[1]-1 & ocnlevels < dipLogR[1]]        
dev1 <- sapply(ocnlevels0, facets:::dlrdev, dipLogR[1], out1)                     
[...]                                                
cn2logR <- ocnlevels0[which.min(dev1)]                                            

The problem here is that no value in the vector ocnlevels satisfies ocnlevels > dipLogR[1]-1 & ocnlevels < dipLogR[1], which makes ocnlevels0 (and subsequently cn2logR) take the value numeric(0).

However, changing the strict inequality in ocnlevels[ocnlevels > dipLogR[1]-1 & ocnlevels < dipLogR[1]] to less than or equal to, as follows:

ocnlevels[ocnlevels > dipLogR[1]-1 & ocnlevels <= dipLogR[1]]

makes ocnlevels0 take the length-1 vector value [1] 1.106909 and subsequently cn2logR the value 1.106909, thereby avoiding the error I described at the beginning.
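The two filters can be compared directly on the values reported above. This is a minimal sketch that uses only the printed ocnlevels and dipLogR; the empty dev1 is a stand-in for the result of the sapply call, whose real values depend on out1 and facets:::dlrdev and are not reproduced here.

```r
# ocnlevels and dipLogR as printed earlier in this report.
ocnlevels <- c(-3.011918512, -1.798816173, -1.672719808, -1.540583510,
               -1.449920926, -1.375430640, -1.214081713, -1.153622814,
               -1.065265524, -0.986037822, -0.872059667, -0.646692979,
               -0.497578555, -0.430391724, -0.378755987, -0.305467079,
               -0.203132374, -0.112683016, -0.045396996,  0.001305458,
                1.106909438,  1.745496039)
dipLogR <- 1.106909438

# Current filter: strictly below dipLogR. No level lies in the open
# interval (dipLogR - 1, dipLogR), so the result is empty.
strict <- ocnlevels[ocnlevels > dipLogR[1] - 1 & ocnlevels < dipLogR[1]]
length(strict)

# Proposed filter: dipLogR itself is allowed, so one level survives.
relaxed <- ocnlevels[ocnlevels > dipLogR[1] - 1 & ocnlevels <= dipLogR[1]]
relaxed

# With no candidate levels there are no deviances either (stand-in),
# and indexing by which.min() of an empty vector returns numeric(0).
dev1 <- numeric(0)
cn2logR <- strict[which.min(dev1)]
length(cn2logR)
```

The gap between 0.001305458 and 1.106909438 is wider than 1, which is why the strict filter finds nothing in this sample.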

Is the strict inequality needed in ocnlevels[ocnlevels > dipLogR[1]-1 & ocnlevels < dipLogR[1]], or is it OK to replace it with a simple less than or equal to? I know that we often use < and <= interchangeably with no consequence at all, but here is an example where it makes a difference.

Best, Yoann

ypradat, Oct 17 '22 14:10

Can you share a data set that will trigger the dipLogR=numeric(0) issue you are getting?

veseshan, Oct 19 '22 18:10

Hello,

I am sorry, but I can no longer find the pileup matrix that led to the error I described above. I'll make sure to keep it if I come across this error again. I thought that the table out0 I provided in my previous message would be enough for you to figure out what was going on.

Best, Yoann

ypradat, Oct 20 '22 13:10