simr
simr copied to clipboard
'long vectors not supported yet' error
I'm trying to use simr with a dataset and model from my experiments and it's generating long vectors not supported yet: subassign.c:1818
errors for every attempted model fit.
Here is a reproducible sample with a subset of the data prepped_mdat.csv:
mdat <- read.csv('prepped_mdat.csv')
mdl <- lmer('s_mfq_tot ~ MFQtminus1 + TimeBetween +(TimeBetween | SDAN) + (1|vid)', data=mdat)
summary(mdl)
powerSim(mdl, nsim=20)
Which produces:
confidence interval):
0.00% ( 0.00, 16.84)
Test: unknown test
Effect size for MFQtminus1 is 0.77
Based on 20 simulations, (0 warnings, 20 errors)
alpha = 0.05, nrow = 201
Time elapsed: 0 h 0 m 0 s
nb: result might be an observed power calculation
All of the errors are 'long vectors not supported yet: subassign.c:1818'
.
When I run the example in the documentation, it appears to work fine:
subj <- factor(1:10)
class_id <- letters[1:5]
time <- 0:2
group <- c("control", "intervention")
subj_full <- rep(subj, 15)
class_full <- rep(rep(class_id, each=10), 3)
time_full <- rep(time, each=50)
group_full <- rep(rep(group, each=5), 15)
covars <- data.frame(id=subj_full, class=class_full, treat=group_full, time=factor(time_full))
## Intercept and slopes for intervention, time1, time2, intervention:time1, intervention:time2
fixed <- c(5, 0, 0.1, 0.2, 1, 0.9)
## Random intercepts for participants clustered by class
rand <- list(0.5, 0.1)
## residual variance
res <- 2
model <- makeLmer(y ~ treat*time + (1|class/id), fixef=fixed, VarCorr=rand, sigma=res, data=covars)
sim_treat <- powerSim(model, nsim=100, test = fcompare(y~time))
Which produces:
Power for model comparison, (95% confidence interval):
51.00% (40.80, 61.14)
Test: Likelihood ratio
Comparison to y ~ time + [re]
Based on 100 simulations, (0 warnings, 0 errors)
alpha = 0.05, nrow = 150
Time elapsed: 0 h 0 m 12 s
Additionally, here's my session info in case it's useful:
R version 4.1.1 (2021-08-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /gpfs/gsfs11/users/MBDU/nielsond/fhist/env/lib/libopenblasp-r0.3.17.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] simr_1.0.5 lmerTest_3.1-3 lme4_1.1-27.1 Matrix_1.3-4
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 purrr_0.3.4 splines_4.1.1
[4] haven_2.4.3 lattice_0.20-44 carData_3.0-4
[7] colorspace_2.0-2 vctrs_0.3.8 generics_0.1.0
[10] mgcv_1.8-36 utf8_1.2.2 rlang_0.4.11
[13] nloptr_1.2.2.2 pillar_1.6.2 foreign_0.8-81
[16] glue_1.4.2 readxl_1.3.1 plyr_1.8.6
[19] binom_1.1-1 lifecycle_1.0.0 stringr_1.4.0
[22] munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0
[25] zip_2.2.0 rio_0.5.27 forcats_0.5.1
[28] RLRsim_3.1-6 parallel_4.1.1 pbkrtest_0.5.1
[31] curl_4.3.2 fansi_0.4.2 broom_0.7.9
[34] Rcpp_1.0.7 backports_1.2.1 scales_1.1.1
[37] plotrix_3.8-2 abind_1.4-5 ggplot2_3.3.5
[40] hms_1.1.0 stringi_1.7.4 openxlsx_4.2.4
[43] dplyr_1.0.7 numDeriv_2016.8-1.1 grid_4.1.1
[46] magrittr_2.0.1 tibble_3.1.4 tidyr_1.1.3
[49] crayon_1.4.1 car_3.0-11 pkgconfig_2.0.3
[52] MASS_7.3-54 ellipsis_0.3.2 data.table_1.14.0
[55] minqa_1.2.4 iterators_1.0.13 R6_2.5.1
[58] boot_1.3-28 nlme_3.1-153 compiler_4.1.1
Any help you can provide would be greatly appreciated.
I'm afraid I can't reproduce the problem, it runs without error for me.
We might be able to pin it down by going step-by-step, i.e.:
y <- doSim(mdl)
mdl2 <- doFit(y, mdl)
doTest(mdl2)
I see that the printout says "unknown test" which is a little suspicious. Maybe if the test was specified explicitly it might work? i.e. doTest(mdl2, fixed("MFQtminus1", "kr"))
.
It looks like it's failing in the fit:
mdat <- read.csv('prepped_mdat.csv')
mdl <- lmer('s_mfq_tot ~ MFQtminus1 + TimeBetween +(TimeBetween | SDAN) + (1|vid)', data=mdat)
summary(mdl)
#powerSim(mdl, nsim=20)
y <- doSim(mdl)
mdl2 <- doFit(y, mdl)
produces
Error in newCall[["formula"]][[2]] <- responseForm :
long vectors not supported yet: subassign.c:1818
Since it works for you, do you think it might be a dependency/version issue, what was the session info you ran this under?
That's quite strange, not where I would have expected it to break down.
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_New Zealand.1252 LC_CTYPE=English_New Zealand.1252
[3] LC_MONETARY=English_New Zealand.1252 LC_NUMERIC=C
[5] LC_TIME=English_New Zealand.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] simr_1.0.5 lme4_1.1-27.1 Matrix_1.3-4
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 xfun_0.24 purrr_0.3.4 splines_4.1.1 haven_2.4.1
[6] lattice_0.20-44 carData_3.0-4 vctrs_0.3.8 generics_0.1.0 htmltools_0.5.1.1
[11] yaml_2.2.1 mgcv_1.8-36 utf8_1.2.2 rlang_0.4.11 nloptr_1.2.2.2
[16] pillar_1.6.1 foreign_0.8-81 glue_1.4.2 DBI_1.1.1 readxl_1.3.1
[21] plyr_1.8.6 binom_1.1-1 lifecycle_1.0.0 stringr_1.4.0 cellranger_1.1.0
[26] zip_2.2.0 evaluate_0.14 knitr_1.33 rio_0.5.27 forcats_0.5.1
[31] RLRsim_3.1-6 parallel_4.1.1 pbkrtest_0.5.1 curl_4.3.2 fansi_0.5.0
[36] broom_0.7.8 Rcpp_1.0.7 backports_1.2.1 plotrix_3.8-1 abind_1.4-5
[41] hms_1.1.0 digest_0.6.27 stringi_1.7.3 openxlsx_4.2.4 dplyr_1.0.7
[46] grid_4.1.1 tools_4.1.1 magrittr_2.0.1 tibble_3.1.3 tidyr_1.1.3
[51] crayon_1.4.1 car_3.0-11 pkgconfig_2.0.3 MASS_7.3-54 ellipsis_0.3.2
[56] data.table_1.14.0 assertthat_0.2.1 minqa_1.2.4 rmarkdown_2.9 iterators_1.0.13
[61] R6_2.5.0 boot_1.3-28 nlme_3.1-152 compiler_4.1.1
>
Tried it on the cloud too with no errors:
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8
[5] LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 LC_PAPER=C.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] simr_1.0.5 lme4_1.1-27.1 Matrix_1.3-3
loaded via a namespace (and not attached):
[1] tidyselect_1.1.1 purrr_0.3.4 splines_4.1.0 haven_2.4.3 lattice_0.20-44 carData_3.0-4
[7] vctrs_0.3.8 generics_0.1.0 mgcv_1.8-35 utf8_1.2.2 rlang_0.4.11 nloptr_1.2.2.2
[13] pillar_1.6.2 foreign_0.8-81 glue_1.4.2 readxl_1.3.1 plyr_1.8.6 binom_1.1-1
[19] lifecycle_1.0.0 stringr_1.4.0 cellranger_1.1.0 zip_2.2.0 rio_0.5.27 forcats_0.5.1
[25] RLRsim_3.1-6 pbkrtest_0.5.1 curl_4.3.2 parallel_4.1.0 fansi_0.5.0 broom_0.7.9
[31] Rcpp_1.0.7 backports_1.2.1 plotrix_3.8-2 abind_1.4-5 hms_1.1.0 stringi_1.7.4
[37] openxlsx_4.2.4 dplyr_1.0.7 grid_4.1.0 tools_4.1.0 magrittr_2.0.1 tibble_3.1.4
[43] crayon_1.4.1 car_3.0-11 tidyr_1.1.3 pkgconfig_2.0.3 MASS_7.3-54 ellipsis_0.3.2
[49] data.table_1.14.0 minqa_1.2.4 iterators_1.0.13 R6_2.5.1 boot_1.3-28 nlme_3.1-152
[55] compiler_4.1.0
Huh, looks like it's lmerTest's fault. (btw, I added some rows to the test data to avoid singular fit warnings: prepped_mdat.csv)
library("lmerTest")
library("lme4")
library("simr")
mdat <- read.csv('prepped_mdat.csv')
mdl <- lmer('s_mfq_tot ~ MFQtminus1 +(1 | SDAN)', data=mdat)
summary(mdl)
print(powerSim(mdl, nsim=20))
Produces 20 errors again:
Power for predictor 'MFQtminus1', (95% confidence interval):
0.00% ( 0.00, 16.84)
Test: unknown test
Effect size for MFQtminus1 is 0.67
Based on 20 simulations, (0 warnings, 20 errors)
alpha = 0.05, nrow = 501
Time elapsed: 0 h 0 m 0 s
nb: result might be an observed power calculation
But if I detach lmerTest
it works fine:
detach("package:lmerTest")
mdl <- lmer('s_mfq_tot ~ MFQtminus1 +(1 | SDAN)', data=mdat)
summary(mdl)
print(powerSim(mdl, nsim=20)
Power for predictor 'MFQtminus1', (95% confidence interval):
100.0% (83.16, 100.0)
Test: Kenward Roger (package pbkrtest)
Effect size for MFQtminus1 is 0.67
Based on 20 simulations, (0 warnings, 0 errors)
alpha = 0.05, nrow = 501
Time elapsed: 0 h 0 m 4 s
nb: result might be an observed power calculation
Digging into things a little bit more, I took apart the doFit function and it looks like the issue is that the call object returned
by getCall
on an lmerModLmerTest
object has a character string in its formula slot, whereas, getCall
on an lmerMod
object returns a formula object in its formula slot.
library("lmerTest")
library("lme4")
library("simr")
mdat <- read.csv('prepped_mdat.csv')
fit <- lmer('s_mfq_tot ~ MFQtminus1 +(1 | SDAN)', data=mdat)
summary(fit)
y <- doSim(fit)
responseName <- formula(fit)[[2]]
# cbind response for binomial
if(as.character(responseName)[1] == "cbind") {
responseForm <- responseName
responseName <- responseName[[2]]
if(is.matrix(y)) y <- y[, 1]
} else {
if(!is.character(responseName)) responseName <- deparse(responseName)
responseName <- make.names(responseName)
responseForm <- as.symbol(responseName)
}
newData <- getData(fit)
newData[[responseName]] <- y
newCall <- getCall(fit)
print(class(fit)) # "lmerModLmerTest"
print(getCall(fit)) # lmer(formula = "s_mfq_tot ~ MFQtminus1 +(1 | SDAN)", data = mdat)
print(class(getCall(fit))) # "call"
print(newCall[["formula"]]) # "s_mfq_tot ~ MFQtminus1 +(1 | SDAN)"
print(class(newCall[["formula"]])) # "character"
print("now detach lmerTest")
detach("package:lmerTest")
mdat <- read.csv('prepped_mdat.csv')
fit <- lmer('s_mfq_tot ~ MFQtminus1 +(1 | SDAN)', data=mdat)
summary(fit)
y <- doSim(fit)
responseName <- formula(fit)[[2]]
# cbind response for binomial
if(as.character(responseName)[1] == "cbind") {
responseForm <- responseName
responseName <- responseName[[2]]
if(is.matrix(y)) y <- y[, 1]
} else {
if(!is.character(responseName)) responseName <- deparse(responseName)
responseName <- make.names(responseName)
responseForm <- as.symbol(responseName)
}
newData <- getData(fit)
newData[[responseName]] <- y
newCall <- getCall(fit)
print(class(fit)) # "lmerMod"
print(getCall(fit)) # lmer(formula = s_mfq_tot ~ MFQtminus1 + (1 | SDAN), data = mdat)
print(class(getCall(fit))) # "call"
print(newCall[["formula"]]) # s_mfq_tot ~ MFQtminus1 + (1 | SDAN) <environment: 0x55556d15c1e0>
print(class(newCall[["formula"]])) # "formula"
I primarily work in Python, so I don't really know how I'd go about fixing the issue, but for the time being, it seems like avoiding lmerTest
is an acceptable workaround.
Thanks for tracking down the problem. I'll be adding some more unit testing around lmerTest
for the next release.
I can reproduce the same error with doTest
for lmerModLmerTest
class objects (from fitting with lmerTest
), which is resolved when refitting the model w/ lme4
and using lmerMod
class object with doTest
.
Thanks @Shotgunosine for identifying the source of this bug – saved me unknown number of hours trying to diagnose what was going on :pray: