"r2mlm" function not printing full variance decomposition even when L1 predictors are cluster-mean centered

Open jgguerriero opened this issue 2 years ago • 7 comments

I'm having an issue which I will describe below.

Setup

library(lmerTest)
library(r2mlm)
test_data=read.csv("Test_data.csv")

Running models (Model 1)

does_work = lmer(Y1 ~ X + (1|ID), test_data)
r2mlm(does_work)
$Decompositions
                     total     within between
fixed, within   0.01089696 0.01886563      NA
fixed, between  0.00000000         NA       0
slope variation 0.00000000 0.00000000      NA
mean variation  0.42239126         NA       1
sigma2          0.56671178 0.98113437      NA

$R2s
         total     within between
f1  0.01089696 0.01886563      NA
f2  0.00000000         NA       0
v   0.00000000 0.00000000      NA
m   0.42239126         NA       1
f   0.01089696         NA      NA
fv  0.01089696 0.01886563      NA
fvm 0.43328822         NA      NA

(Model 2)

does_not_work = lmer(Y2 ~ X + (1|ID), test_data)
r2mlm(does_not_work)
$Decompositions
                     total
fixed           0.01151214
slope variation 0.00000000
mean variation  0.39957923
sigma2          0.58890863

$R2s
         total
f   0.01151214
v   0.00000000
m   0.39957923
fv  0.01151214
fvm 0.41109137

I'm not sure why Model 2 doesn't print the variance explained at each level, as Model 1 does; it shows only the "total" variance explained. The X variable is cluster-mean centered, and the only thing that changes between the two models is the dependent variable, which is in raw form in both cases (i.e., it still contains both within-cluster and between-cluster variance). I'm not sure why Y1 and Y2 lead to different r2mlm printouts, considering they are qualitatively identical as far as I can tell.
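A quick sanity check that a predictor really is cluster-mean centered is to confirm that its per-cluster means are zero up to floating-point error. The sketch below uses base R only and made-up toy data (the data frame `toy` and the columns `X_raw` and `X` are hypothetical stand-ins, not the attached Test_data.csv); it checks only the centering and says nothing about why r2mlm() prints what it does.

```r
# Toy data (hypothetical; not the Test_data.csv attached to this issue)
set.seed(42)
toy <- data.frame(ID = rep(1:20, each = 8))
toy$X_raw <- rnorm(nrow(toy), mean = 5)

# Cluster-mean center: subtract each cluster's mean from its observations
toy$X <- toy$X_raw - ave(toy$X_raw, toy$ID)

# Per-cluster means of a centered predictor should be zero up to
# floating-point error, so compare against a tolerance rather than == 0
cluster_means <- tapply(toy$X, toy$ID, mean)
max_abs_mean <- max(abs(cluster_means))
is_centered <- max_abs_mean < 1e-8

print(max_abs_mean)
print(is_centered)
```

To run the same check on real data, replace `toy$X` and `toy$ID` with the predictor and cluster identifier used in the model.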

I've attached the csv file that can be used for running this code, as this seems to be a variable-specific problem. Test_data.csv

Thank you for the work you've done on this topic and in creating this package!

Joe

jgguerriero avatar Sep 23 '22 16:09 jgguerriero

Did this ever get resolved? I'm having the same issue. The null model splits the variance at each level, but adding a cluster-mean-centered covariate makes it revert to "total" variance explained only.

lmer(Y2 ~ 1 + (1|ID), test_data) works ok

lmer(Y2 ~ X + (1|ID), test_data) reverts to "total" variance

p1981thompson avatar Mar 01 '23 08:03 p1981thompson

I am having the same issue when adding two or more cluster-mean-centered predictors. If the predictors are added separately, the variance is split into within and between components, but for the model with two predictors it shows only the total variance. Did I miss something?

wgfm_gg <- lmerTest::lmer(z_tab_wgfm ~ 1 + cwc_s1gg + (1|sg_id) , data=data, REML=T,control = lmerControl(optimizer = "bobyqa"))
r2mlm(wgfm_gg, bargraph = TRUE)
$Decompositions
                     total     within between
fixed, within   0.05713224 0.06168741      NA
fixed, between  0.00000000         NA       0
slope variation 0.00000000 0.00000000      NA
mean variation  0.07384279         NA       1
sigma2          0.86902497 0.93831259      NA

$R2s
         total     within between
f1  0.05713224 0.06168741      NA
f2  0.00000000         NA       0
v   0.00000000 0.00000000      NA
m   0.07384279         NA       1
f   0.05713224         NA      NA
fv  0.05713224 0.06168741      NA
fvm 0.13097503         NA      NA

wgfm_ka <- lmerTest::lmer(z_tab_wgfm ~ 1 + cwc_s1ukaa + (1|sg_id) , data=data, REML=T,control = lmerControl(optimizer = "bobyqa"))
r2mlm(wgfm_ka, bargraph = TRUE)
$Decompositions
                      total      within between
fixed, within   0.003952098 0.004210645      NA
fixed, between  0.000000000          NA       0
slope variation 0.000000000 0.000000000      NA
mean variation  0.061403198          NA       1
sigma2          0.934644705 0.995789355      NA

$R2s
          total      within between
f1  0.003952098 0.004210645      NA
f2  0.000000000          NA       0
v   0.000000000 0.000000000      NA
m   0.061403198          NA       1
f   0.003952098          NA      NA
fv  0.003952098 0.004210645      NA
fvm 0.065355295          NA      NA

wgfm_ggka <- lmerTest::lmer(z_tab_wgfm ~ 1 +cwc_s1gg + cwc_s1ukaa + (1|sg_id) , data=data, REML=T,control = lmerControl(optimizer = "bobyqa"))
r2mlm(wgfm_ggka, bargraph = TRUE)
$Decompositions
                     total
fixed           0.06606656
slope variation 0.00000000
mean variation  0.06935868
sigma2          0.86457476

$R2s
         total
f   0.06606656
v   0.00000000
m   0.06935868
fv  0.06606656
fvm 0.13542524

simowll avatar Aug 09 '23 09:08 simowll

Currently having the same issue, still no solution?

rikfor avatar Oct 12 '23 09:10 rikfor

Same issue here. It's not just changing the response variable that affects things: If I remove the cluster-mean-centered level-1 predictor but retain the cluster-means level-2 predictor, I can generate all 3 barplots for an outcome that only showed the total variance barplot initially. I've also tried re-running the model with only complete observations, which made no difference. So it seems the issue is the association between the level-1 cluster-mean-centered predictor and the outcome, at least in my case.

saichele avatar Nov 09 '23 00:11 saichele

Thanks all for reporting this. I'll have more time to look into it at the beginning of December. Apologies for any inconvenience. Has anyone tried the r2mlm_manual() function? Might just be a problem with the wrapper.

mkshaw avatar Nov 09 '23 22:11 mkshaw

Thanks for the fast reply! Your intuition was correct: Using r2mlm_manual(), I can generate total/within/between columns of estimated effects when r2mlm() only provides total. I had success with an lmer() model and also with a glmer() logistic model (but how are you approximating sigma2 for glms?).

saichele avatar Nov 10 '23 00:11 saichele

Same here, r2mlm_manual() works fine. Thank you for your reply!

rikfor avatar Nov 18 '23 15:11 rikfor
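
For a random-intercept model like the one in the original report, the r2mlm_manual() workaround might look roughly like the sketch below. It runs on made-up toy data (the `toy` data frame and its columns are hypothetical, not the attached Test_data.csv) and assumes the argument layout described in the package documentation: column indices for the covariates, the fixed intercept passed through `gamma_b`, and the random-effect covariance matrix passed as `Tau`. Check ?r2mlm_manual against your installed version before relying on it.

```r
library(lme4)
library(r2mlm)

# Toy data (hypothetical): 30 clusters of 10 observations each
set.seed(1)
n_clusters <- 30
n_per <- 10
toy <- data.frame(ID = rep(seq_len(n_clusters), each = n_per))
x_raw <- rnorm(nrow(toy))
toy$X <- x_raw - ave(x_raw, toy$ID)  # cluster-mean centered level-1 predictor
toy$Y <- 0.3 * toy$X +
  rep(rnorm(n_clusters), each = n_per) +  # random intercepts
  rnorm(nrow(toy))                        # level-1 residuals

fit <- lmer(Y ~ X + (1 | ID), data = toy)

# Pull the estimates out of the fitted model and pass them by hand
res <- r2mlm_manual(
  data = toy,
  within_covs = which(names(toy) == "X"),  # column index of the level-1 predictor
  between_covs = NULL,                     # no level-2 predictors in this model
  random_covs = NULL,                      # no random slopes
  gamma_w = unname(fixef(fit)["X"]),
  gamma_b = unname(fixef(fit)["(Intercept)"]),
  Tau = matrix(as.numeric(VarCorr(fit)$ID), 1, 1),  # random-intercept variance
  sigma2 = sigma(fit)^2,
  has_intercept = TRUE,
  clustermeancentered = TRUE,
  bargraph = FALSE
)

print(res$Decompositions)
print(res$R2s)
```

With more predictors, the entries of `gamma_w` and `gamma_b` need to follow the same order as the column indices given in `within_covs` and `between_covs`.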