mice
mice copied to clipboard
Question: Manually pooling parameters from RE model/Degrees of freedom
Does the automatic pool() function count a random effect (e.g., a random intercept) as a parameter in calculating the degrees of freedom for a mixed model? I ask because I would like to manually pool using pool.scalar but I'm not sure if this should be counted under the "k" parameters. Also, if using pool.scalar for a multiple regression problem, you should include "k=# of total parameters in model" in calculating each pooled coefficient, correct? E.g.,
For a model with 5 parameters (4 covariates plus intercept)
Pooling 1st coefficient estimate
pool.scalar(Q, U, n = 50, k = 5)
Pooling 2nd coefficient estimate
pool.scalar(Q, U, n = 50, k = 5)
etc...
This doesn't answer your question, but have you tried using the mitml
package to pool the coefficients of your random effects model? See e.g. https://stefvanbuuren.name/fimd/sec-mlguidelines.html.
A workflow where imp
is your mids
object, model
is the random effects model, and fit
is the resulting mira
object might look like:
library(mice)
library(lme4)
library(mitml)
fit <- with(imp, lmer(model))
testEstimates(as.mitml.result(fit), extra.pars = TRUE)
Thank you for the reference. I’m pooling marginalized coefficients from glmmadaptive package, so I need to use pool.scalar, else that workflow would be fine. Whether the RE should still be counted as parameter after marginalization is another issue I need to look into, but I thought knowing what the standard workflow does would be good.
Note: You will need to install the broom.mixed
package to pool results from multilevel analyses.
Hi! I have a very related problem. I am fitting a model with the package lcmm
that does not fit into the standard MICE workflow partly because it does not accept the with()
-approach and partly because there is no tidy()
-method for models of class lcmm.
So, I will have to pool the results manually, which I try to do with the following workflow. Please notice that I use the tdc data because it replicates my dataset well and that I am aware that a raw call to mice in the tdc data will result imprecise imputations. For this I am only interested in the method for pooling the results:
library(mice)
library(lcmm)
library(tidyverse)
data("tbc")
#this is an example of the model I want to fit
m2 <- lcmm(bmi.z ~ age + sex, random = ~ age, subject = "id", data = tbc, link = "splines")
imp <- mice(tbc, maxit = 35)
com <- complete(imp, "long")
#fits separate models to each imputed dataset, stores in list:
modellist <- list()
for(i in 1:length(unique(com$.imp))){
data = filter(com, .imp == i)
modellist[[i]] = lcmm(bmi.z ~ age + sex, random = ~ age, subject = "id", data = data, link = "splines")
}
#defines tidy function for lcmm:
tidy.lcmmm <- function(x, conf.int = FALSE, conf.level = 0.95, ...) {
result <- summary(x) %>%
tibble::as_tibble(rownames = "term") %>%
dplyr::rename(estimate = coef,
std.error = Se,
statistic = Wald,
p.value = `p-value`)
if (conf.int) {
ci <- confint(x, level = conf.level)
result <- dplyr::left_join(result, ci, by = "term")
}
result
}
#tidies results:
df <- tibble(n = 1:5) %>%
mutate(models = map(modellist, ~tidy.lcmmm(.x)),
models = map(models, ~as_tibble(.x)))
df <- df %>%
unnest(cols = c(models))
#extracts and pools results for age:
estimates_age <- filter(df, term == "age") %>%
select(estimate, std.error)%>%
mutate(var = std.error^2)
pooled.results <- pool.scalar(Q = estimates_age$estimate, U = estimates_age$var, n = 306, k = 2)
I wonder if this seems like the correct way to use pool.scalar, and specifically, I wonder about k, as the model really seems to be fitting many more parameters "behind the scenes".
> estimates(m2)
age sex cholesky 1 cholesky 2 cholesky 3 I-splines1 I-splines2 I-splines3 I-splines4 I-splines5 I-splines6
0.01815061 -0.01449388 -1.03820628 0.03098324 0.08742982 -4.89337146 0.61847877 -0.61525249 2.08221454 1.65755067 1.70915255
I-splines7
0.54300731
And finally, thank you for a great package!