Expand MLM R2 measures

Open mattansb opened this issue 4 years ago • 10 comments

Paper 1: https://doi.org/10.1037/met0000184 Paper 2: http://doi.org/10.1080/00273171.2019.1660605

These should expand on the existing marginal and conditional R2, as they break it down by level and source of variation. (That is why it is not for effectsize: these aren't split by effect or term, but by fixed/random/type, and so are model-wise effect sizes.)

(As implemented in r2mlm)

mattansb, Oct 15 '20 05:10

I was thinking of mentioning this to you all, but it appears that you beat me to it! :-) I should say I've never written a custom function before or had much experience with GitHub. But, if you all feel it would be helpful, I would be glad to try to help with the implementation!

TarandeepKang, Oct 19 '20 18:10

I'm going to take that thumbs up for a "yes"? If so, how would you like me to get started? Any kind of pointers would be great!

I have done a lot of coding for analyses before, but have never contributed any functionality to a package. I am brand-new at this, you have been warned! :-)

TarandeepKang, Oct 29 '20 15:10

Hey @TarandeepKang, sorry, it seems we forgot to follow up on this; as you know, these last few months have been quite bumpy. If you're still interested, I would suggest starting by taking a look at r2mlm to understand its two-step process: (1) get all the necessary ingredients from the model, and (2) throw them into r2mlm_manual() and let the magic happen. Then, try rewriting step 1 as a generic function using the insight package, which should make it easier to retrieve all of the ingredients. After that, we can try to understand step 2 and see whether it needs to be reimplemented or rewritten to accommodate more models.

DominiqueMakowski, Mar 23 '21 00:03

I added here a working file in which I decompose the two steps, namely 1) extracting ingredients 2) calculating the indices.

So the first step, I think, is to revise the step 1 function, replacing all of its internal functions with something more generalizable using insight.

DominiqueMakowski, Mar 23 '21 04:03

So extracting the ingredients (step 1) is the part that we can really generalize / simplify so that it works for more models. Then, the output of this "init" function is passed to r2mlm_manual(), which does some heavy arithmetic with it.

I have isolated this "init" step that works for lme4 models:

  • Call https://github.com/easystats/performance/blob/89512a511fcee57a2960bb9e4c555c5afad4bb0b/WIP/r2mlm_test.R#L1-L13

  • Result

$Decompositions
                total              within             between
fixed, within   0.0819107586265126 0.142972810913675  NA     
fixed, between  0                  NA                 0      
slope variation 0.0377525833965782 0.0658960197411068 NA     
mean variation  0.42708856248221   NA                 1      
sigma2          0.453248095494699  0.791131169345218  NA     

$R2s
    total              within             between
f1  0.0819107586265126 0.142972810913675  NA     
f2  0                  NA                 0      
v   0.0377525833965782 0.0658960197411068 NA     
m   0.42708856248221   NA                 1      
f   0.0819107586265126 NA                 NA     
fv  0.119663342023091  0.208868830654782  NA     
fvm 0.546751904505301  NA                 NA   
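
A quick way to read this output: the combined rows of $R2s are just sums of the rows of $Decompositions. The sketch below is only a sanity check in base R, not part of r2mlm or the WIP file; the numbers are copied from the printed result above, and the variable names are made up for illustration.

```r
# Values copied by hand from the "total" and "within" columns printed above.
total <- c(
  f1     = 0.0819107586265126,  # fixed, within
  f2     = 0,                   # fixed, between
  v      = 0.0377525833965782,  # slope variation
  m      = 0.42708856248221,    # mean variation
  sigma2 = 0.453248095494699    # residual
)
within <- c(
  f1     = 0.142972810913675,
  v      = 0.0658960197411068,
  sigma2 = 0.791131169345218
)

# Combined measures, built up from the individual sources.
f_total   <- total[["f1"]] + total[["f2"]]   # all fixed effects
fv_total  <- f_total + total[["v"]]          # fixed + slope variation
fvm_total <- fv_total + total[["m"]]         # fixed + slope + mean variation
fv_within <- within[["f1"]] + within[["v"]]  # within-cluster fixed + slope

# The sums should match the f, fv and fvm rows, and each column should
# add up to 1 once the residual (sigma2) is included.
stopifnot(
  isTRUE(all.equal(f_total, 0.0819107586265126)),
  isTRUE(all.equal(fv_total, 0.119663342023091)),
  isTRUE(all.equal(fvm_total, 0.546751904505301)),
  isTRUE(all.equal(fv_within, 0.208868830654782)),
  isTRUE(all.equal(fvm_total + total[["sigma2"]], 1)),
  isTRUE(all.equal(fv_within + within[["sigma2"]], 1))
)
```

In this example the fixed between-cluster part (f2) is 0, so all of the between-cluster variance is attributed to mean variation (m = 1 in the between column).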

For now, the init step still relies quite heavily on the tidyverse (which we would need to get rid of if we were to implement this robust R2MLM initialization), and it doesn't work for nlme models because model.frame() returns something different. There are also still a few lme4/nlme if/else switches that insight didn't manage to remove; for instance, get_parameters("random") returns a list for lme4 but a data frame for nlme.

(@strengejacke looking at init and utils I feel like there's quite a lot we can simplify don't you think?)

DominiqueMakowski, Mar 23 '21 09:03

As I said before, I am happy to help, just let me know what exactly you need doing. You suggested some steps for me above, but it looks like you've already done them? :-)

TarandeepKang, Mar 23 '21 10:03

Haha yeah, I gave it a quick go to prepare the terrain. Essentially, the objectives would be to 1) replace all the tidyverse functions and 2) fix it for nlme. Once this is done, we can add the step 2 part with the actual calculations.

DominiqueMakowski, Mar 23 '21 10:03

I'd be happy to hop in here, as one of the original r2mlm authors! I'm a bit pressed for time in the next few weeks, but mid/late-April I can implement (1) and (2).

mkshaw, Mar 23 '21 15:03

Hey @mkshaw, super glad you hopped in! I was about to tag you here very soon anyway, since yesterday I finally had the chance to take a look at this issue ☺️

As you know, we'd like to offer users access to this feature and evaluate the possibility of generalizing it so that it can accommodate other models like glmmTMB, gamm4 and... Bayesian models? (not sure if that's possible though 😬)

As far as I understand r2mlm, you separate the process into two steps: preprocessing and computation. My initial idea was to implement a preprocessing pipeline that is compatible with easystats (i.e., one that doesn't use tidyverse functions and that uses insight to gather the model's info, to simplify and streamline the code) - and hopefully, by doing this, support for models like glmmTMB would naturally emerge. Then, I planned to still call your package's r2mlm::r2mlm_manual() to do the magic.

For now this whole thing is WIP and more in the "evaluation" stage (to see whether we can bring something to the table), so feel free to let us know any thoughts, issues or ideas that you might have!

DominiqueMakowski, Mar 24 '21 00:03

@DominiqueMakowski You understand the two-step process in r2mlm correctly! The plan to preprocess with easystats and then call r2mlm::r2mlm_manual() sounds good to me, and I'd love to see support for models like glmmTMB naturally emerge from that. Like I mentioned, I'd be happy to hop in and try to implement the easystats preprocessing in mid/late-April -- I've been looking for ways to familiarize myself with various easystats packages, and this seems like a good one!

mkshaw, Mar 24 '21 12:03