papaja
papaja copied to clipboard
Support custom effect sizes
It has been my stated goal from the start that I would like to do no computing if possible---just formatting and typesetting. Unfortunately, many analysis objects do not include all desired effect size estimates. Because of this, many methods report only unstandardized effect sizes or end up doing some computations. Some recent internal changes provide an avenue to change this. It would be nice if users could pass a function to apa_print()
that is used internally to calculate the desired effect size and confidence interval and include it in the output (or a precalculated data.frame
). This way, it would be much easier to support, for example, standardized effect sizes. Here's a rough prototype:
library("dplyr")
apa_print.afex_aov <- function(x, es = effectsize::omega_squared) {
aov_table <- apa_print(x, es = NULL)$table
if(is.function(es)) {
aov_es_table <- es(x)
}
aov_es_table <- tidy_es(aov_es_table)
aov_es_table <- aov_es_table %>%
mutate(estimate = printnum(estimate, digits = 3, gt1 = FALSE)) %>%
rowwise() %>%
mutate(conf.int = print_confint(conf.low, conf.high, digits = 3, gt1 = FALSE)) %>%
select(estimate, conf.int) %>%
label_variables(
estimate = "\\hat\\eta^2_p"
, conf.int = "90% CI"
)
aov_table$estimate <- aov_es_table$estimate
aov_table$conf.int <- aov_es_table$conf.int
# Assembly of the strings for reporting follows
# ...
}
tidy_es <- function(x) {
UseMethod("tidy_es", x)
}
tidy_es.default <- function(x) {
tryCatch(
broom::tidy(x)
, error = function(e) {
stop("The object returned by the function passed to 'es' could not be tidied, that is converted to a data.frame with columns 'estimate', 'conf.low', and 'conf.high'. Please supply a function that returns a data.frame of this structure or pass such a data.frame directly.")
}
)
}
tidy_es.data.frame <- function(x) {
es_cols <- colnames(x)
if(
!"estimate" %in% es_cols |
!"conf.low" %in% es_cols |
!"conf.high" %in% es_cols
) {
stop("Effect size data.frame must contain the colums 'estimate', 'conf.low', and 'conf.high'")
}
x
}
tidy_es.effectsize_table <- function(x) {
insight::standardize_names(x, style = "broom")
}
tidy_es.parameters_model <- function(x) {
tidy_es.effectsize_table(x)
}
tidy_es.parameters_distribution <- function(x) {
tidy_es.effectsize_table(x)
}
Check out the lm
methods for user interface considerations regarding measures of R squared and standardized coefficients.
I've just pushed a draft on branch add-custom-effect-sizes
. I ran into some obstacles that need further consideration. However, first the good news:
I considerably "flattened" our ANOVA processing: I could retire print_anova()
and the arrange_anova.anova()
method by switching to canonize()
, beautify()
, and glue_apa_results()
. This will allow us to further simplify apa_print.anova()
, but I first postponed this step.
Regarding the effectsize package:
- I started calculating effect sizes with effectsize package for
afex_aov
objects, because these are best supported by effectsize. -
effectsize::eta_squared()
does not return effect sizes for intercept terms. - Many ANOVA outputs that we support are not supported by effectsize. Therefore, we will have to adopt defaults method-by-method and we will have to keep our effect-size calculation code. Maybe we only support effectsize for
afex_aov
objects?
There are still many loose ends here (with regard to function defaults, etc.), which is why I didn't create a PR. Think it's worth a thought if this is really a fruitful avenue, especially because we cannot abandon our old eta-squared code. We will get another dependency with the only benefit that there will be CIs for some effect-size measures.
To document our discussion:
- We want to add support for
effectsize
and similar functions as an option. The default will remain as it is now, returning unstandardized effect sizes as available from the models. - This implies that
effectsize
will be a suggested package, and we will try to make it possible to use a global option to change the default to standardized effect sizes viaeffectsize
if the package is available. - There is the issue that
summary
-output is currently not supported ineffectsize
, but we will try to get around this issue by using the call to get the original object (seelm
/glm
code). - I have initiated a discussion regarding the effect size calculation for intercept terms in ANOVA model: https://github.com/easystats/effectsize/issues/156
Moving forward with integrating the effectsize package, I think it will be the best solution to delegate calculating estimates together with confidence intervals to a specialized function in all possible instances, which can then be from the effectsize package. For some statistical models, however, we will most likely want original-scale estimates as a default, and therefore we need a function that simply returns original-scale estimates together with confidence intervals. This doesn't seem to be the scope of the effectsize package (or I overlooked something), so maybe we integrate it in papaja.
Here is a draft for such a function:
original_scale <- function(x, ...) {
UseMethod("original_scale")
}
original_scale.default <- function(x, conf.int = .95, ...) {
if(isTRUE(conf.int)) {
conf.int <- .95
} else {
conf.int <- as.numeric(conf.int[[1L]])
}
coefs <- stats::coefficients(x)
y <- data.frame(
term = names(coefs)
, estimate = unname(coefs)
, stringsAsFactors = FALSE
)
if(as.logical(conf.int)) {
conf_out <- stats::confint(x, level = conf.int, ...)
y$conf.low <- conf_out[, 1L, drop = TRUE]
y$conf.high <- conf_out[, 2L, drop = TRUE]
}
class(y) <- c("papaja_estimates", "data.frame")
attr(y, "estimate") <- "original_scale"
attr(y, "conf.level") <- conf.int
y
}
test_lm <- lm(yield ~ N, npk)
test_glm <- glm(as.numeric(N) ~ yield, npk, family = poisson)
original_scale(test_lm)
original_scale(test_glm)
That's for adding custom effect sizes! I'd like to use this functionality with emmGrid
objects. Has this been added yet?