DelayedArray icon indicating copy to clipboard operation
DelayedArray copied to clipboard

More informative error than `the supplied seed must support extract_array()`

Open LTLA opened this issue 11 months ago • 3 comments

A recent change in the DelayedArray stack (don't know exactly where) is causing ScaledMatrix tests to break on Bioc-devel with the rather unhelpful message:

Error in `validObject(.Object)`: invalid class "ScaledMatrix" object: 
    the supplied seed must support extract_array()
Backtrace:
     ▆
  1. └─ScaledMatrix (local) spawn_extra_scenarios(100, 50) at test-mult.R:52:5
  2.   └─ScaledMatrix:::spawn_scenarios_basic(...) at test-mult.R:38:5
  3.     └─ScaledMatrix:::scale_and_center(y, ref, it) at tests/testthat/setup.R:38:13
  4.       └─ScaledMatrix::ScaledMatrix(y, center = center, scale = scale) at tests/testthat/setup.R:20:5
  5.         ├─DelayedArray::DelayedArray(...)
  6.         └─ScaledMatrix::DelayedArray(...)
  7.           └─DelayedArray::new_DelayedArray(seed, Class = "ScaledMatrix")
  8.             └─S4Vectors::new2(Class, seed = seed)
  9.               └─methods::new(...)
 10.                 ├─methods::initialize(value, ...)
 11.                 └─methods::initialize(value, ...)
 12.                   └─methods::validObject(.Object)

This doesn't make any sense because ScaledMatrix (via its seed) does, in fact, implement extract_array(). I assume there's a try being done somewhere to test whether extract_array() works, which is catching and hiding the real error message.

Session information
R Under development (unstable) (2023-11-10 r85507)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/trunk/lib/libRblas.so 
LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] DelayedArray_0.29.7   SparseArray_1.3.4     S4Arrays_1.3.4       
 [4] abind_1.4-5           IRanges_2.37.1        S4Vectors_0.41.3     
 [7] MatrixGenerics_1.15.0 matrixStats_1.2.0     BiocGenerics_0.49.1  
[10] Matrix_1.6-5          ScaledMatrix_1.11.1   testthat_3.2.1       

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5       crayon_1.5.2      cli_3.6.2         rlang_1.1.3      
 [5] glue_1.7.0        fansi_1.0.6       brio_1.1.4        grid_4.4.0       
 [9] lifecycle_1.0.4   compiler_4.4.0    waldo_0.5.2       XVector_0.43.1   
[13] rstudioapi_0.15.0 lattice_0.22-5    R6_2.5.1          utf8_1.2.4       
[17] pillar_1.9.0      magrittr_2.0.3    tools_4.4.0       withr_3.0.0      
[21] zlibbioc_1.49.0   desc_1.4.3       

LTLA avatar Feb 29 '24 07:02 LTLA

FWIW I eventually debugged it by modifying the DelayedArray source to eliminate the try() altogether in DelayedOp-class.R. Just slapping res onto the message wasn't enough, unfortunately, as I needed the traceback to figure out where it was coming from.

LTLA avatar Mar 01 '24 07:03 LTLA

This is a true positive.

The unit tests in ScaledMatrix construct a ScaledMatrixSeed object for which extract_array() doesn't work (non-compliant seed). To reproduce:

library(ScaledMatrix)

## Define the CrippledMatrix class and all the methods defined
## in ScaledMatrix/tests/testthat/test-mult.R by copy-pasting
## lines 9-59 from the file.

## Then:

cm <- new("CrippledMatrix", x=matrix(runif(50), ncol=5))
seed <- ScaledMatrixSeed(cm)  # non-compliant seed!

SM <- DelayedArray(seed)
# Error in validObject(.Object) : invalid class “ScaledMatrix” object: 
#     the supplied seed must support extract_array()

This is a validity check that I added a few weeks ago in devel.

Trying to call extract_array() directly on this seed indeed fails:

extract_array(seed, list(1:3, 1:2))
# Error in as.vector(data) : 
#   no method for coercing this S4 class to a vector

In release, SM <- DelayedArray(seed) works but the object is broken e.g. as.matrix() doesn't work on it:

as.matrix(SM)
# Error in h(simpleError(msg, call)) : 
#   error in evaluating the argument 'x' in selecting a method for function 'drop': no method for coercing this S4 class to a vector

or it can't be displayed:

show(SM)
# Error in h(simpleError(msg, call)) : 
#   error in evaluating the argument 'x' in selecting a method for function 'type': no method for coercing this S4 class to a vector

etc...

Purpose of this new validity check is to detect broken DelayedArray objects as early as possible i.e. at construction time.

Suggestions welcome to make the error message more informative.

H.

Session information
R Under development (unstable) (2024-01-02 r85764)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 23.10

Matrix products: default
BLAS:   /home/hpages/R/R-4.4.r85764/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ScaledMatrix_1.11.0

loaded via a namespace (and not attached):
 [1] zlibbioc_1.49.0       SparseArray_1.3.4     Matrix_1.6-5         
 [4] lattice_0.22-5        MatrixGenerics_1.15.0 matrixStats_1.2.0    
 [7] abind_1.4-5           S4Arrays_1.3.5        XVector_0.43.1       
[10] BiocGenerics_0.49.1   stats4_4.4.0          IRanges_2.37.1       
[13] grid_4.4.0            DelayedArray_0.29.9   compiler_4.4.0       
[16] tools_4.4.0           S4Vectors_0.41.3      crayon_1.5.2         

hpages avatar Mar 01 '24 09:03 hpages

Yes, I'm fine with the motivation behind the check. It's just that the error message itself is not very informative.

It's unfortunate that we need to return a string in the validity method, because that loses the traceback.

I guess we could just not wrap it in a try and just expose the traceback to the user. Currently the user doesn't even know who to ask to fix the problem, as the failing matrix could be arbitrarily deep in a tree of delayed operations.

LTLA avatar Mar 02 '24 23:03 LTLA