make `parameters()` show fixed effects restricted to 0

Open SchmidtPaul opened this issue 3 years ago • 15 comments

Sorry if I am missing something, but I can't find a way to include the fixed-effects solutions that are set to 0 due to restrictions/constraints. Here is an example where SAS does it: [image]

As far as I can tell, parameters::parameters() (and stats::coef()) will always drop them from the table:

levels(PlantGrowth$group)
#> [1] "ctrl" "trt1" "trt2"

m <- lm(weight ~ group, PlantGrowth)

parameters::parameters(m)
#> Registered S3 method overwritten by 'parameters':
#>   method                         from      
#>   format.parameters_distribution datawizard
#> Parameter    | Coefficient |   SE |        95% CI | t(27) |      p
#> ------------------------------------------------------------------
#> (Intercept)  |        5.03 | 0.20 | [ 4.63, 5.44] | 25.53 | < .001
#> group [trt1] |       -0.37 | 0.28 | [-0.94, 0.20] | -1.33 | 0.194 
#> group [trt2] |        0.49 | 0.28 | [-0.08, 1.07] |  1.77 | 0.088
#> 
#> Uncertainty intervals (equal-tailed) and p values (two-tailed) computed
#>   using a Wald t-distribution approximation.

Created on 2022-05-17 by the reprex package (v2.0.1)

Yet, I sometimes want an additional row, `group [ctrl]`, with just a 0 for Coefficient and NA for everything else in my parameters table. Is there a way to do this with {parameters}?
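A minimal base-R sketch of the table described above, for illustration only. It is not something {parameters} offers at this point in the thread. It assumes treatment contrasts with an intercept, so the first level of each factor is the reference, and it assumes every factor appears as a main effect. `add_reference_rows()` is a hypothetical helper name.

```r
m <- lm(weight ~ group, PlantGrowth)

# Coefficient table from base R, with the term names as a regular column
tab <- as.data.frame(summary(m)$coefficients)
tab <- cbind(Parameter = rownames(tab), tab)
rownames(tab) <- NULL

# Hypothetical helper: insert one row per factor for its reference level,
# with Estimate = 0 and NA for all other statistics.
# Assumption: treatment coding, reference = first level of the factor.
add_reference_rows <- function(tab, model) {
  for (f in names(model$xlevels)) {
    ref <- tab[1, ]
    ref[, -1] <- NA_real_
    ref$Parameter <- paste0(f, model$xlevels[[f]][1])
    ref$Estimate <- 0
    # Place the reference row directly before the factor's first coefficient
    idx <- which(startsWith(tab$Parameter, f))[1]
    tab <- rbind(tab[seq_len(idx - 1), ], ref, tab[idx:nrow(tab), ])
  }
  rownames(tab) <- NULL
  tab
}

out <- add_reference_rows(tab, m)
print(out)
```

Matching terms by name prefix is fragile when two variables share a prefix, so this is only suitable for small, hand-checked models.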

SchmidtPaul avatar May 17 '22 11:05 SchmidtPaul

You mean adding an additional row for the reference level of factors?

strengejacke avatar May 17 '22 12:05 strengejacke

I can see the value of an `include_reference_level` argument to include the reference level for factors.

So it would show up something like:


#> Parameter         | Coefficient |   SE |        95% CI | t(27) |      p
#> -----------------------------------------------------------------------
#> (Intercept)       |        5.03 | 0.20 | [ 4.63, 5.44] | 25.53 | < .001
#> group [ref: ctrl] |             |      |               |       |
#> group [trt1]      |       -0.37 | 0.28 | [-0.94, 0.20] | -1.33 | 0.194 
#> group [trt2]      |        0.49 | 0.28 | [-0.08, 1.07] |  1.77 | 0.088
#> 

bwiernik avatar May 17 '22 12:05 bwiernik

I would leave all of the columns aside from Parameter blank. The tricky thing might be detecting when there is a reference level (e.g., only for treatment or SAS contrasts with an intercept included).
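One way the detection mentioned above might be sketched in base R, assuming a model object that stores a `contrasts` component the way `lm()` and `glm()` do. `has_reference_level()` is a hypothetical name, and the check only recognises contrasts given by name; a factor with a custom contrast matrix is conservatively reported as having no reference level.

```r
# Hypothetical helper: for each factor in the model, report whether it has a
# reference level in the sense discussed here (treatment or SAS contrasts,
# and an intercept in the model).
has_reference_level <- function(model) {
  has_intercept <- attr(stats::terms(model), "intercept") == 1
  vapply(model$contrasts, function(x) {
    has_intercept &&
      is.character(x) &&
      x %in% c("contr.treatment", "contr.SAS")
  }, logical(1))
}

m_trt <- lm(weight ~ group, PlantGrowth)
m_sum <- lm(weight ~ group, PlantGrowth, contrasts = list(group = "contr.sum"))
m_noint <- lm(weight ~ 0 + group, PlantGrowth)

has_reference_level(m_trt)
has_reference_level(m_sum)
has_reference_level(m_noint)
```

Only the first model should report a reference level for `group`: sum coding has no single reference level, and without an intercept every level gets its own coefficient.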

bwiernik avatar May 17 '22 12:05 bwiernik

This would only work for treatment coding, so that would need to be tested.

Personally, I don't see the merit of adding all these 0s?

mattansb avatar May 17 '22 12:05 mattansb

Well I'd be happy with an include_reference_level argument that leads to all blanks and no 0s, too. (I just found 0s to be intuitive when teaching and I am used to it from SAS.)

SchmidtPaul avatar May 17 '22 12:05 SchmidtPaul

> Personally, I don't see the merit of adding all these 0s?

This is mainly for completeness. Alternatively, you could add a footnote indicating the reference levels. And it's not that unusual to add the "estimate" (i.e. 0 for linear, or 1 for OR etc.); sometimes there's just "Ref." in the estimate column.

Here's an example of a recent paper: [image]

The idea is to have a table that is completely self-explaining, so you don't need to read the methods section to remember all levels of categorical variables.
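The reference "estimate" mentioned above (0 on the linear scale, 1 for odds ratios) follows from exponentiating the link-scale value. A small base-R sketch, assuming a logistic regression on `mtcars` with `gear` as a factor under default treatment coding; the names `log_odds` and `odds_ratio` are illustrative.

```r
# gear as a factor; its first level (3) is the reference under treatment coding
mtcars2 <- transform(mtcars, gear = factor(gear))
m <- glm(vs ~ wt + gear, data = mtcars2, family = binomial())

# On the link (log-odds) scale the reference level is fixed at 0 ...
log_odds <- c(gear3 = 0, coef(m)[c("gear4", "gear5")])

# ... so on the odds-ratio scale it becomes exp(0) = 1
odds_ratio <- exp(log_odds)
round(odds_ratio, 2)
```

This is why a "Ref." row can show 0 in a table of coefficients and 1 in a table of odds ratios without any extra estimation.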

strengejacke avatar May 17 '22 12:05 strengejacke

But then the intercept(s) are omitted, to have a table of slopes only?

This is less intuitive to me (I would prefer a clear label of the intercept instead), but I can understand why someone would want this (especially if they are accustomed to it).

mattansb avatar May 17 '22 12:05 mattansb

> But then the intercept(s) are omitted, to have a table of slopes only?

No, not necessarily. This is more common in my field, where we're more interested in the strength of the associations instead of the predicted outcome. That's why we often omit the intercept in tables.

strengejacke avatar May 17 '22 12:05 strengejacke

I just saw in two other of my recent papers, intercepts are included in the tables ;-)

strengejacke avatar May 17 '22 12:05 strengejacke

My (controversial) view: This will add code complexity and convey little information (literally 0s and dots). I understand why we may want to add empty rows for presentation purposes in "finished" regression tables intended for publication, but that's not quite the job of parameters.

vincentarelbundock avatar Jun 25 '22 10:06 vincentarelbundock

parameters is the package that provides the regression parameter tables that people display in publications.

bwiernik avatar Jun 25 '22 10:06 bwiernik

Haha, yeah, sorry. I guess I always only ever use/see the markdown in console, so I lost sight ;)

vincentarelbundock avatar Jun 25 '22 11:06 vincentarelbundock

Actually, we already do something similar for grouping parameters: https://easystats.github.io/parameters/articles/model_parameters_print.html#group-parameters

strengejacke avatar Jun 26 '22 22:06 strengejacke

Maybe we just add a special option "reference" to that argument that adds the reference to factors?

And maybe allow selecting a subset of factors, or combining reference and grouping: if the argument is given a list with a slot called "reference", the reference formatting is applied to the factors named there?

bwiernik avatar Jun 26 '22 22:06 bwiernik

I found that {broom.helpers} does what I was looking for and {ggally} is making use of that, too:

library(dplyr)

m <- lm(weight ~ group, PlantGrowth)

broom.helpers::tidy_plus_plus(model = m) %>% 
  select(term, contrasts:conf.high)
#> # A tibble: 3 x 12
#>   term     contrasts contrasts_type reference_row label n_obs estimate std.error
#>   <chr>    <chr>     <chr>          <lgl>         <chr> <dbl>    <dbl>     <dbl>
#> 1 groupct~ contr.tr~ treatment      TRUE          ctrl     10    0        NA    
#> 2 grouptr~ contr.tr~ treatment      FALSE         trt1     10   -0.371     0.279
#> 3 grouptr~ contr.tr~ treatment      FALSE         trt2     10    0.494     0.279
#> # ... with 4 more variables: statistic <dbl>, p.value <dbl>, conf.low <dbl>,
#> #   conf.high <dbl>

GGally::ggcoef_model(
  model = m,
  add_reference_rows = TRUE,
  categorical_terms_pattern = "{level} (ref: {reference_level})"
)

Created on 2022-07-12 by the reprex package (v2.0.1)

SchmidtPaul avatar Jul 12 '22 07:07 SchmidtPaul

See examples (and maybe further discussion) here: https://github.com/easystats/parameters/pull/902

strengejacke avatar Sep 11 '23 13:09 strengejacke

Use add_reference = TRUE in the print() method.

library(parameters)
data("fish")
m1 <- glmmTMB::glmmTMB(
  count ~ child + camper + zg + (1 | ID),
  ziformula = ~ child + camper + (1 | persons),
  data = fish,
  family = glmmTMB::truncated_poisson()
)
print(model_parameters(m1, effects = "fixed"), add_reference = TRUE)
#> # Fixed Effects
#> 
#> Parameter   | Log-Mean |   SE |         95% CI |     z |      p
#> ---------------------------------------------------------------
#> (Intercept) |     1.41 | 0.18 | [ 1.06,  1.75] |  8.02 | < .001
#> child       |    -0.53 | 0.12 | [-0.77, -0.29] | -4.40 | < .001
#> camper [0]  |     0.00 |      |                |       |       
#> camper [1]  |     0.58 | 0.10 | [ 0.39,  0.78] |  5.93 | < .001
#> zg          |     0.13 | 0.04 | [ 0.05,  0.21] |  3.17 | 0.002 
#> 
#> # Zero-Inflation
#> 
#> Parameter   | Log-Odds |   SE |         95% CI |     z |      p
#> ---------------------------------------------------------------
#> (Intercept) |    -0.39 | 0.65 | [-1.67,  0.89] | -0.60 | 0.551 
#> child       |     2.05 | 0.31 | [ 1.45,  2.66] |  6.63 | < .001
#> camper [0]  |     0.00 |      |                |       |       
#> camper [1]  |    -1.01 | 0.32 | [-1.64, -0.37] | -3.12 | 0.002
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald z-distribution approximation.
#> 
#> The model has a log- or logit-link. Consider using `exponentiate =
#>   TRUE` to interpret coefficients as ratios.

data(mtcars)
mtcars$gear <- as.factor(mtcars$gear)
m <- glm(vs ~ wt + gear, data = mtcars, family = "binomial")
print(model_parameters(m, exponentiate = TRUE, drop = "(Intercept)"), add_reference = TRUE)
#> Parameter | Odds Ratio |   SE |        95% CI |     z |     p
#> -------------------------------------------------------------
#> wt        |       0.07 | 0.09 | [0.00,  0.52] | -2.05 | 0.040
#> gear [3]  |       1.00 |      |               |       |      
#> gear [4]  |       3.21 | 3.98 | [0.27, 41.36] |  0.94 | 0.348
#> gear [5]  |       0.03 | 0.07 | [0.00,  1.47] | -1.41 | 0.159
#> 
#> Uncertainty intervals (profile-likelihood) and p-values (two-tailed)
#>   computed using a Wald z-distribution approximation.

Created on 2023-09-11 with reprex v2.0.2

strengejacke avatar Sep 11 '23 18:09 strengejacke