`BPCells::svds` crashes for `threads > 1`
Hello,
My R session crashes if I pass an integer greater than 1 as `threads` to `svds`. Is this the proper way to use it?
svd <- BPCells::svds(normdata, k=30, threads = 2L)
pr.data <- BPCells::multiply_cols(svd$v, svd$d)
Hi @Artur-man, thanks again for using BPCells! I'm unable to reproduce the crash on my end. Can you give a little more information on the data in `normdata` and on your system setup?
It works with a small matrix on my end:
library(BPCells)
mat <- get_demo_mat()
mat <- log1p(multiply_cols(mat, 1/Matrix::colSums(mat))) %>% write_matrix_dir(tempdir(), overwrite = TRUE)
svd <- svds(mat, k = 30, threads = 2L)
multiply_cols(svd$v, svd$d)
Hey @immanuelazn,
I did a quick check with write_matrix_dir, and the crash appears to happen specifically with write_matrix_hdf5. I think the problem also occurs with dummy data.
So, this works:
library(BPCells)
m <- matrix(data = seq_len(100*100), nrow=100) |>
as("IterableMatrix")
file <- tempdir()
m <- write_matrix_dir(m, dir = file)
svd <- BPCells::svds(m, k=30, threads=2L)
but this doesn't work and crashes the R session:
library(BPCells)
m <- matrix(data = seq_len(100*100), nrow=100) |>
as("IterableMatrix")
file <- tempfile(fileext = ".h5")
m <- write_matrix_hdf5(m, path = file, group = "name")
svd <- BPCells::svds(m, k=30, threads=2L)
Here is the log of the crash when I run R in the terminal:
*** caught segfault ***
*** caught segfault ***
address 0x1, cause 'invalid permissions'
address 0x1, cause 'invalid permissions'
Error in svds_cpp(it, k, solver_params[["ncv"]], solver_params[["maxitr"]], :
bad value
Error in svds_cpp(it, k, solver_params[["ncv"]], solver_params[["maxitr"]], :
bad value
R(6798,0x1727bf000) malloc: Double free of object 0x19b91e000
R(6798,0x1727bf000) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6
I've been able to reproduce this on a Mac and have made a few observations:
- Using the Homebrew HDF5 library on macOS hits the crash
- Using the BPCells binary from R-universe on macOS hits the crash
- Using a custom-compiled HDF5 with the thread safety option turned on works on macOS without a crash
- This example does not crash when run on Linux, even without a thread-safe HDF5 build
My current conclusions are:
- This crash is specifically due to HDF5 not being thread safe
- Technically we should probably treat HDF5 reads as not thread safe everywhere, even though we've only seen crashes on Macs
Obviously, saving the input to a non-HDF5 format (for example with write_matrix_dir, which works above) is the immediate workaround, but in general it should be impossible to cause these kinds of session crashes with BPCells. For the fix, I'd propose:
- Make BPCells check if the `H5_HAVE_THREADSAFE` macro is defined from HDF5 (vast majority of builds will be non-thread-safe)
- When building against a non-thread-safe HDF5 library, put all access to HDF5 API calls behind a global lock
- Ideally, figure out a way to print a performance warning when someone runs a multi-threaded operation with HDF5 file inputs to let them know the data reads are being forced to happen single-threaded.
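To make the proposal above concrete, here is a minimal C++ sketch of the compile-time check, the global lock, and the warning. It is only an illustration under stated assumptions, not BPCells code: `Hdf5Guard`, `hdf5_mutex`, `read_chunk`, `run_readers`, and `warn_if_serialized` are hypothetical names, and a plain counter stands in for the real HDF5 calls so the sketch compiles without HDF5 installed.

```cpp
#include <cassert>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

// In a real build H5_HAVE_THREADSAFE would come from the HDF5 headers
// (for example via #include <hdf5.h>). This sketch does not include them,
// so the macro is undefined here and the locking branch is compiled.
#ifdef H5_HAVE_THREADSAFE
// Thread-safe HDF5: the guard does nothing.
class Hdf5Guard {
  public:
    Hdf5Guard() {}
};
constexpr bool kHdf5Serialized = false;
#else
// Hypothetical process-wide lock shared by every HDF5 call site.
inline std::mutex &hdf5_mutex() {
    static std::mutex m;
    return m;
}
// RAII guard: holds the global lock for the lifetime of the object.
class Hdf5Guard {
  public:
    Hdf5Guard() : lock_(hdf5_mutex()) {}

  private:
    std::lock_guard<std::mutex> lock_;
};
constexpr bool kHdf5Serialized = true;
#endif

// Stand-in for non-thread-safe library state: a plain (non-atomic) counter
// that is only safe to update while the guard holds the lock.
static long fake_reads = 0;

// Stand-in for one HDF5 read (hypothetical; no real HDF5 call is made).
void read_chunk() {
    Hdf5Guard guard;  // serializes the "HDF5 call" in non-thread-safe builds
    ++fake_reads;
}

// Run n_threads workers that each perform reads_per_thread guarded reads,
// and return the total number of reads recorded.
long run_readers(int n_threads, int reads_per_thread) {
    assert(n_threads > 0 && reads_per_thread >= 0);
    fake_reads = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([reads_per_thread] {
            for (int i = 0; i < reads_per_thread; ++i) read_chunk();
        });
    }
    for (auto &w : workers) w.join();
    return fake_reads;
}

// Print a performance warning when a multi-threaded operation will have its
// HDF5 reads forced through the global lock. Returns true if it warned.
bool warn_if_serialized(int threads) {
    if (kHdf5Serialized && threads > 1) {
        std::cerr << "Warning: HDF5 library is not thread safe; "
                     "HDF5 reads will run single-threaded\n";
        return true;
    }
    return false;
}
```

With the lock in place, `run_readers(4, 10000)` records every read without a data race, and `warn_if_serialized(2)` tells the user why a multi-threaded run may be slower than expected. A single global lock is the simplest choice here; it trades read parallelism for safety, which seems acceptable given that non-HDF5 inputs are unaffected.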