pbmcapply
pbmclapply fails to return list where lapply, pbapply, and mclapply have no issue
Hi,
I'm writing a custom function and have run into an issue where pbmclapply fails to return a list, while otherwise identical code using lapply, pblapply, and mclapply works without any problem.
In this example I'm using a function from the Seurat package to read a matrix in HDF5 file format, but the same error persists when it is replaced with fread, read.csv, etc.
Here are the pblapply and mclapply versions. They sit inside a larger function, but this is the relevant portion, given a defined list of files and sample names:
  pboptions(char = "=")
  if (parallel) {
    raw_data_list <- mclapply(mc.cores = num_cores, 1:length(sample.names), function(i) {
      h5_loc <- file.path(data.dir, file.list[1])
      data <- Read10X_h5(filename = h5_loc)
    })
  } else {
    raw_data_list <- pblapply(1:length(x = sample.names), function(i) {
      h5_loc <- file.path(data.dir, file.list[1])
      data <- Read10X_h5(filename = h5_loc)
    })
  }
  names(raw_data_list) <- sample.names
  return(raw_data_list)
}
If I change the mclapply section to use pbmclapply:
  if (parallel) {
    raw_data_list <- pbmclapply(mc.cores = num_cores, X = 1:length(sample.names), FUN = function(i) {
      h5_loc <- file.path(data.dir, file.list[1])
      data <- Read10X_h5(filename = h5_loc)
    })
  } else {
    raw_data_list <- pblapply(1:length(x = sample.names), function(i) {
      h5_loc <- file.path(data.dir, file.list[1])
      data <- Read10X_h5(filename = h5_loc)
    })
  }
  names(raw_data_list) <- sample.names
  return(raw_data_list)
}
It appears to work (the progress bar runs to 100%), but then I get this error:
Reading 10X H5 files from directory
|====================================================================| 100%, Elapsed 00:15
Error in names(raw_data_list) <- sample.names :
attempt to set an attribute on NULL
In other words, pbmclapply is returning NULL instead of a list inside the function, so the names assignment fails.
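To make this kind of failure easier to diagnose, a guard placed before the names assignment could report what the apply call actually returned. This is only a sketch: check_apply_result is a hypothetical helper (not part of pbmcapply, pbapply, or Seurat), it uses base R only, and treating NULL elements as failures is an assumption that would not suit a function that legitimately returns NULL.

```r
# Hypothetical helper (not part of pbmcapply): validate the value returned by
# an lapply-style call before names are assigned to it.
check_apply_result <- function(result, expected_n) {
  if (is.null(result)) {
    stop("apply call returned NULL instead of a list", call. = FALSE)
  }
  if (!is.list(result) || length(result) != expected_n) {
    stop(sprintf("expected a list of length %d, got %s of length %d",
                 expected_n, class(result)[1], length(result)),
         call. = FALSE)
  }
  # parallel::mclapply reports a task that raised an error as a "try-error"
  # element; NULL elements are also treated as failures here (an assumption).
  failed <- vapply(result,
                   function(x) is.null(x) || inherits(x, "try-error"),
                   logical(1))
  if (any(failed)) {
    stop("tasks failed at positions: ",
         paste(which(failed), collapse = ", "), call. = FALSE)
  }
  invisible(result)
}

# Usage, with a plain lapply standing in for the parallel call.
sample_names <- c("a", "b", "c")
res <- lapply(seq_along(sample_names), function(i) i^2)
res <- check_apply_result(res, length(sample_names))
names(res) <- sample_names
```

With a guard like this, the function would stop with a message about the NULL return value rather than the less direct "attempt to set an attribute on NULL".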
It also gets slightly weirder. When I got this error I was trying to read 12 files with 4 cores. If I remove files from the target directory so that it only reads 5 files with 4 cores, it succeeds with no issues. The files are all identical except for their file names, which are sequentially ordered, so this is not a case of a corrupt file.
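A Seurat-free reproduction might help narrow this down. The sketch below is untested against the real data and rests on assumptions: make_matrix and run_case are hypothetical helpers, the matrix size is arbitrary, and a synthetic matrix may not trigger the same behaviour as an HDF5 read. It compares what lapply and pbmclapply return for 5 and 12 tasks, and it only calls pbmclapply if the pbmcapply package is installed.

```r
# Hypothetical stand-in for the file read: returns a numeric matrix.
# The dimensions are arbitrary assumptions, not taken from the real files.
make_matrix <- function(i, n_row = 200, n_col = 200) {
  matrix(as.numeric(i), nrow = n_row, ncol = n_col)
}

# Forking is not available on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Run one case and summarise what came back, instead of failing on names<-.
run_case <- function(n_tasks, apply_fun) {
  res <- apply_fun(seq_len(n_tasks), make_matrix)
  list(n_tasks = n_tasks, is_null = is.null(res), length = length(res))
}

# Baseline with lapply for the two task counts described above.
baseline <- lapply(c(5, 12), run_case, apply_fun = lapply)

# The same cases through pbmclapply, only if the package is available.
if (requireNamespace("pbmcapply", quietly = TRUE)) {
  pb_fun <- function(X, FUN) {
    pbmcapply::pbmclapply(X = X, FUN = FUN, mc.cores = n_cores)
  }
  pb_cases <- lapply(c(5, 12), run_case, apply_fun = pb_fun)
}
```

If pb_cases shows is_null = TRUE for 12 tasks but not for 5, the problem would be reproducible without Seurat. If not, the size or type of the returned objects may matter.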
Any insights would be great, because I'd really love to have progress bars for the parallel versions of these functions.
Thanks! Sam