
r2-differences

Open strengejacke opened this issue 5 years ago • 28 comments

Moved from https://github.com/strengejacke/sjstats/issues/67 over here...

@hauselin due to the re-organization of packages, all "model-performance" related stuff will now be implemented in the performance package.

@DominiqueMakowski What do you think, can we let r2() accept multiple model objects, and when the user passes multiple models, produce an "anova"-like output? I.e., the r-squared values for all models, plus an extra column indicating the difference(s)?

strengejacke avatar Mar 26 '19 14:03 strengejacke
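For illustration, a minimal sketch of what such an "anova"-like output could look like (hypothetical, using plain lm models; not the package's implementation):

```r
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + cyl, data = mtcars),
  m3 = lm(mpg ~ wt + cyl + hp, data = mtcars)
)
r2s <- sapply(models, function(m) summary(m)$r.squared)

# One row per model, with the successive R2 differences in an extra column:
data.frame(Model = names(r2s), R2 = r2s, R2_difference = c(NA, diff(r2s)))
```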

I guess that would make sense, as models' performance indices are very often used to compare models... I wonder about the syntax though: do we need something implicit like r2(..., named_args), with which we could do r2(model1, model2, model3, named_arg=FALSE), or is it better to keep the current behaviour and accept lists of models as the first argument, r2(c(model1, model2, model3), named_arg=FALSE)?

This could later be extended to model_performance() (or a new compare_performance()? That would open up a new type of function, compare_*), which would compute and compare all possible indices (i.e., all indices that are compatible with all the models).

DominiqueMakowski avatar Mar 27 '19 02:03 DominiqueMakowski

Currently, r2() is defined as r2 <- function(model, ...) {. I would say we just make it r2(...) (or probably r2(x, ...)) and capture the models with list(...) or similar inside the function.

strengejacke avatar Mar 27 '19 07:03 strengejacke
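A sketch of that proposed signature (not the actual implementation; .r2_single() is a hypothetical single-model workhorse):

```r
r2 <- function(...) {
  # Capture any number of models passed via the dots.
  objects <- list(...)
  if (length(objects) == 1L) {
    .r2_single(objects[[1]])
  } else {
    lapply(objects, .r2_single)  # one R2 per model, to be tabulated later
  }
}
```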

Agreed.

For the sake of flexibility, we might want to check the provided arguments to see if they are (compatible) statistical models. Maybe we could add a small is_model() in insight, and run this check on the arguments provided to r2(...) (e.g., models_to_compare <- all_args_in_ellipsis[is_model(all_args_in_ellipsis)]) to increase stability?

DominiqueMakowski avatar Mar 27 '19 09:03 DominiqueMakowski

Sounds good! is_model() would be a long list of inherits()-commands, I guess ;-) For comparison, should we also check if all models are of the same type? (i.e. no mixing from lm, glm, coxph etc.)

strengejacke avatar Mar 27 '19 10:03 strengejacke
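A hypothetical is_model() along those lines, essentially one long inherits() check (the class list is illustrative, not exhaustive), together with the dots-filtering idea from above:

```r
is_model <- function(x) {
  inherits(x, c("lm", "glm", "merMod", "glmmTMB", "coxph", "stanreg", "brmsfit"))
}

# Filter the dots down to actual model objects:
keep_models <- function(...) {
  dots <- list(...)
  dots[vapply(dots, is_model, logical(1))]
}

m <- lm(mpg ~ wt, data = mtcars)
length(keep_models(m, "not a model", 42))  # 1
```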

I think we should check the input with all_models_equal() because it makes no sense to compare r-squared values from different distributional families.

strengejacke avatar Mar 28 '19 09:03 strengejacke
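A sketch of that safety check, assuming insight::all_models_equal() returns FALSE when the objects are of different classes:

```r
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- glm(am ~ wt, data = mtcars, family = binomial())

if (!insight::all_models_equal(m1, m2)) {
  warning("Comparing R2 across different model classes may not be meaningful.")
}
```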

Might be something for later than the initial release... It requires some work, especially for more complex R2 measures like the Bayesian or Nakagawa ones.

strengejacke avatar Apr 10 '19 09:04 strengejacke

Agree, this can be improved later on

DominiqueMakowski avatar Apr 10 '19 10:04 DominiqueMakowski

I suggest implementing this in compare_performance() as R2_delta, for linear models only; they are the only ones for which this really makes sense (for GLMs, the total "variance" on the latent scale increases with model complexity... which is weird...).

We might then also add Cohens_f2:

[image: Cohen's f² formula]

mattansb avatar Apr 12 '20 15:04 mattansb
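A sketch of R2_delta and Cohen's f² for two nested linear models, using the standard incremental formula f² = (R²_AB − R²_A) / (1 − R²_AB) (illustrative models, not the package's implementation):

```r
fit_a  <- lm(mpg ~ wt, data = mtcars)        # reduced model
fit_ab <- lm(mpg ~ wt + cyl, data = mtcars)  # full model

r2_a  <- summary(fit_a)$r.squared
r2_ab <- summary(fit_ab)$r.squared

r2_delta  <- r2_ab - r2_a
cohens_f2 <- (r2_ab - r2_a) / (1 - r2_ab)
```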

The R2 diff could nicely fit in test_performance(), especially if there are any CIs/significance tests we could derive from it 😏

I wonder if, in general, we should have a difference_performance() utility function, or a difference=TRUE argument in compare_performance() that just displays the differences instead of the raw indices (basically sapply(compare_performance(...), diff))?

DominiqueMakowski avatar Jan 15 '21 11:01 DominiqueMakowski
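A rough sketch of that difference=TRUE idea: difference the numeric columns of the compare_performance() output across models (a sketch only, for two lm models):

```r
library(performance)

m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + cyl, data = mtcars)

cp  <- compare_performance(m1, m2)
num <- vapply(cp, is.numeric, logical(1))
sapply(cp[num], diff)  # pairwise differences of the numeric indices
```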

we should have a difference_performance()

No.

or a difference=TRUE arg in compare_performance() that just displays the difference instead of the raw indices?

Difference to what? Models as entered in their order? I'm not sure this is informative for most indices, is it?

strengejacke avatar Jan 27 '21 20:01 strengejacke

Olkin, Alf, and Graf have each developed CIs of various flavors for R2 differences.

bwiernik avatar Apr 05 '21 23:04 bwiernik

I was teaching yesterday and I was called out by students "You don't have delta-R2 available??" This is embarrassing guys....

mattansb avatar Jul 01 '21 04:07 mattansb

Haha. Tell your students that it's honestly not a clear problem to solve when you aren't incorporating the incremental validity into your model (à la some specific SEM models) or estimating it via bootstrapping 😜

bwiernik avatar Jul 01 '21 04:07 bwiernik

We should definitely have some difference-related capabilities, and R2 seems like the best place to start

DominiqueMakowski avatar Jul 01 '21 04:07 DominiqueMakowski

ΔR², ΔR, and √ΔR are all good statistics to that end. As a start, bootstrapping would be a great method for intervals/p-values (honestly, bootstrap estimators are often the best compared to the delta method; proper analytic solutions are a pain).

bwiernik avatar Jul 01 '21 04:07 bwiernik
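A bootstrap sketch for a ΔR² interval along those lines (models and data are illustrative, not the package's implementation):

```r
library(boot)

delta_r2 <- function(data, idx) {
  d <- data[idx, ]
  summary(lm(mpg ~ wt + cyl, data = d))$r.squared -
    summary(lm(mpg ~ wt, data = d))$r.squared
}

set.seed(42)
res <- boot(mtcars, statistic = delta_r2, R = 2000)
boot.ci(res, type = "perc")  # percentile CI for the R2 difference
```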

Is this a valid or useful measure at all? https://twitter.com/brodriguesco/status/1461604815759417344?s=20

strengejacke avatar Nov 20 '21 08:11 strengejacke

That tweet is just referencing the distinction between R2 and adjusted/cross-validated R2

bwiernik avatar Nov 20 '21 10:11 bwiernik

cross-validated R2?

strengejacke avatar Nov 20 '21 13:11 strengejacke

Out-of-sample R2 (either actually computed in a hold-out sample, or via leave-one-out, or via an analytic approximation)

bwiernik avatar Nov 20 '21 13:11 bwiernik
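For illustration, out-of-sample R2 via a simple hold-out split (leave-one-out or an analytic approximation would be alternatives; data and models are illustrative):

```r
set.seed(1)
idx   <- sample(nrow(mtcars), size = round(0.7 * nrow(mtcars)))
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

fit  <- lm(mpg ~ wt + cyl, data = train)
pred <- predict(fit, newdata = test)

# R2 computed on data the model never saw:
1 - sum((test$mpg - pred)^2) / sum((test$mpg - mean(test$mpg))^2)
```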

https://journals.sagepub.com/doi/abs/10.1177/1094428106292901?casa_token=QnJ3HAUoBFEAAAAA:Un99_4wYO9dp8i7uM5Pkdwh3surUpUS9pLV294PciaCe8r2AWTfY14KHiLr5yxwJnve3HGEI92SM

bwiernik avatar Nov 20 '21 13:11 bwiernik

Not sure why this is out-of-sample/cross-validated, since predictors are added, but no additional/different data sets are involved?

strengejacke avatar Nov 20 '21 14:11 strengejacke

I mean that his tweet is lamenting that in-sample R2 is positively biased. It is absolutely meaningful to compare models on R2; the solution to his concern is to use an unbiased R2 estimator.

bwiernik avatar Nov 20 '21 14:11 bwiernik

Ah, ok. Was a bit confused, because we were not discussing cross validated R2 here.

strengejacke avatar Nov 20 '21 14:11 strengejacke

Saw a recent tweet where @bwiernik mentioned R2 differences. I'd suggest implementing it first in a function called test_r2(), and then perhaps incorporating it into test_performance().

DominiqueMakowski avatar Apr 20 '22 07:04 DominiqueMakowski

And then compare_models()

bwiernik avatar Apr 20 '22 11:04 bwiernik

compare_performance() ;-) compare_models() is an alias for compare_parameters() (and hence located in parameters)

strengejacke avatar Apr 20 '22 12:04 strengejacke

Maybe compare_models() would be better located in report, and include both parameters and performance indices.

strengejacke avatar Apr 20 '22 12:04 strengejacke

That would be less confusing

bwiernik avatar Apr 20 '22 13:04 bwiernik