clustermq
When using clustermq with foreach, a data.table stops behaving like a data.table in workers, although is.data.table returns TRUE
In the example below, a data.table used under clustermq starts behaving like a data.frame. I've tried to fix it by copying the object, calling setDT, or calling as.data.table inside the loop, but none of these helps.
library(foreach)
library(data.table)
library(clustermq)
dt_test <- data.table(a = rep(1, 4), b = rep(2, 4), c = rep(3, 4), d = rep(4, 4))
clustermq::register_dopar_cmq(n_jobs = 1, template = list(partition = "my_partition"), fail_on_error = FALSE)
# this doesn't return rows, but columns instead, as if the data.table had lost its class
foreach(i = 1:nrow(dt_test), .verbose = FALSE, .export = c("dt_test")) %dopar% {return(dt_test[i])}
# while running in sequential mode returns rows (as a data.table should)
foreach(i = 1:nrow(dt_test), .verbose = FALSE, .export = c("dt_test")) %do% {return(dt_test[i])}
# yet is.data.table still returns TRUE inside the worker
foreach(i = 1:nrow(dt_test), .verbose = FALSE, .export = c("dt_test")) %dopar% {return(is.data.table(dt_test))}
I've traced the bug to a change introduced between clustermq v0.8.95 and v0.9.0. I've tried data.table versions from 1.13.4 to 1.15.2 and can reproduce it with all of them, so I think the issue is independent of the data.table version.
It even happens when clustermq falls back to local processing and reports "Running sequentially ('LOCAL') ... "
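To make the symptom concrete, here is a minimal sketch that does not involve clustermq at all. It only contrasts the two indexing semantics: with a single index, a data.table selects row i, while a data.frame selects column i. The name df_test is introduced here purely for illustration. The explicit two-argument form x[i, ] selects a row under both semantics, so it might serve as a workaround inside %dopar%, but that is an assumption I have not verified under clustermq. The behaviour above would be consistent with data.table falling back to data.frame indexing when it considers the calling environment not data.table-aware, though nothing here proves that is the cause.

```r
library(data.table)

dt_test <- data.table(a = rep(1, 4), b = rep(2, 4), c = rep(3, 4), d = rep(4, 4))

# Hypothetical helper object: the same data as a plain data.frame
df_test <- as.data.frame(dt_test)

i <- 2

# data.table semantics: a single index selects row i
row_dt <- dt_test[i]

# data.frame semantics: a single index selects column i
col_df <- df_test[i]

# The explicit two-argument form selects row i under both semantics
row_dt_explicit <- dt_test[i, ]
row_df_explicit <- df_test[i, ]

dim(row_dt)           # 1 4  (one row, all four columns)
dim(col_df)           # 4 1  (all four rows, column "b" only)
dim(row_dt_explicit)  # 1 4
dim(row_df_explicit)  # 1 4
```

If the column-instead-of-row output under %dopar% matches col_df here, that supports the reading that the worker is applying data.frame indexing to an object that still carries the data.table class.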
Versions of all things:
R 4.3.3, clustermq >= 0.9.0, data.table 1.15.2, foreach 1.5.2.
clustermq backend: SLURM (but I think the issue is independent of that).