output-style from report

Open strengejacke opened this issue 4 years ago • 14 comments

I think we can / should improve the output-style from reporting model tables. Currently, it is:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter         | Coefficient |   SE | CI_low | CI_high |     t | df_error |    p | Std_Coefficient |    Fit
#> --------------------------------------------------------------------------------------------------------------
#> (Intercept)       |        1.50 | 0.19 |   1.12 |    1.87 |  7.93 |      146 | 0.00 |            1.50 |       
#> Petal.Length      |        1.93 | 0.14 |   1.66 |    2.20 | 13.96 |      146 | 0.00 |            1.93 |       
#> Speciesversicolor |       -1.93 | 0.23 |  -2.40 |   -1.47 | -8.28 |      146 | 0.00 |           -1.93 |       
#> Speciesvirginica  |       -2.56 | 0.33 |  -3.21 |   -1.90 | -7.74 |      146 | 0.00 |           -2.56 |       
#>                   |             |      |        |         |       |          |      |                 |       
#> AIC               |             |      |        |         |       |          |      |                 | 106.23
#> BIC               |             |      |        |         |       |          |      |                 | 121.29
#> R2                |             |      |        |         |       |          |      |                 |   0.84
#> R2 (adj.)         |             |      |        |         |       |          |      |                 |   0.83
#> RMSE              |             |      |        |         |       |          |      |                 |   0.33

Created on 2020-02-14 by the reprex package (v0.3.0)

Things that can be improved

  1. CIs can be collapsed into one column, like in model_parameters().
  2. Column Std_Coefficient is identical to Coefficient
  3. My main concern is the fit indices, which are additional rows for an additional column. I think we can change the style here, having
  • top left: headline, maybe formula, or "linear regression" or so
  • top right: fit indices
  • bottom: coefficient table

For the layout of 3), I have something like the Stata output in mind (without the table for sums of squares):

[image: hqdefault]

or

[image]

strengejacke avatar Feb 14 '20 08:02 strengejacke

Mmh, I thought that the output was using the same pipeline as model_parameters(), so that it would automatically format, for instance, the CI column 🤔 but it's true that I got lost in the endless calls of methods. Happy that your fresh and hawk-like eye finds things to improve.

DominiqueMakowski avatar Feb 14 '20 09:02 DominiqueMakowski

Some points have been resolved:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter            | Coefficient |   SE |             CI |     t |  df |      p | Coefficient (std.) |    Fit
#> ---------------------------------------------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] | 34.72 | 146 | < .001 |               1.50 |       
#> Petal.Length         |        0.90 | 0.06 | [ 0.78,  1.03] | 13.96 | 146 | < .001 |               1.93 |       
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] | -8.28 | 146 | < .001 |              -1.93 |       
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] | -7.74 | 146 | < .001 |              -2.56 |       
#>                      |             |      |                |       |     |        |                    |       
#> AIC                  |             |      |                |       |     |        |                    | 106.23
#> BIC                  |             |      |                |       |     |        |                    | 121.29
#> R2                   |             |      |                |       |     |        |                    |   0.84
#> R2 (adj.)            |             |      |                |       |     |        |                    |   0.83
#> RMSE                 |             |      |                |       |     |        |                    |   0.33

Created on 2020-09-18 by the reprex package (v0.3.0)

Now only 3) remains. And we should add the CI level to the column name as well...

@DominiqueMakowski Maybe we should just copy the print() method from model_parameters() to report as well?

strengejacke avatar Sep 18 '20 14:09 strengejacke

> @DominiqueMakowski Maybe we should just copy the print() method from model_parameters() to report as well?

Yes. The only thing we wanted to do, and which was implemented very early on, is to color-code the values. We had, for instance, green/red for the coefficient depending on the direction, and white/yellow for CIs excluding 0 and p-values < 0.1 (and pd > 95%). But we could bake that directly into parameters down the line as well.

DominiqueMakowski avatar Sep 19 '20 01:09 DominiqueMakowski

I would suggest putting the fit indices in rows below the table, in the Coefficient column (with SEs, p values, etc. as relevant). Something like this:

library(report)
library(magrittr)
data(iris)

lm(Sepal.Length ~ Petal.Length + Species, data=iris) %>%
  report() %>%
  table_long() 
#> Parameter            | Coefficient |   SE |             CI |     t |  df |      p | Coefficient (std.) 
#> ------------------------------------------------------------------------------------------------------
#> (Intercept)          |        3.68 | 0.11 | [ 3.47,  3.89] | 34.72 | 146 | < .001 |   1.50        
#> Petal.Length         |        0.90 | 0.06 | [ 0.78,  1.03] | 13.96 | 146 | < .001 |   1.93       
#> Species [versicolor] |       -1.60 | 0.19 | [-1.98, -1.22] | -8.28 | 146 | < .001 |  -1.93       
#> Species [virginica]  |       -2.12 | 0.27 | [-2.66, -1.58] | -7.74 | 146 | < .001 |  -2.56    
#> ------------------------------------------------------------------------------------------------------                 
#> AIC                  |   106.23    |      |                |       |     |        |                    
#> BIC                  |   121.29    |      |                |       |     |        |                    
#> R2                   |     0.84    |      |                |       |     |        |                    
#> R2 (adj.)            |     0.83    |      |                |       |     |        |                    
#> RMSE                 |     0.33    |      |                |       |     |        |                    

That is compact and permits reporting of intervals with the same format as coefficients. It would also play nicely with RMarkdown output.

bwiernik avatar Apr 06 '21 14:04 bwiernik

Just a thought: the vertical separator lines look neat and are a good visual aid for separating the cells. However, I have a small laptop screen, and those lines sometimes make the output much wider, so that output that would normally fit on my screen ends up wrapping at the end of my console. That messes up the whole output and makes it unreadable because things don't align anymore. That said, I feel like that might be more of a "me" problem (and I should probably get an external monitor). Combining the 95% CI was a good call in making it narrower though.

rempsyc avatar Sep 03 '22 15:09 rempsyc

I think you can pass the sep argument via the print() methods.

library(parameters)
m <- lm(Sepal.Width ~ Species, data = iris)

model_parameters(m)
#> Parameter            | Coefficient |   SE |         95% CI | t(147) |      p
#> ----------------------------------------------------------------------------
#> (Intercept)          |        3.43 | 0.05 | [ 3.33,  3.52] |  71.36 | < .001
#> Species [versicolor] |       -0.66 | 0.07 | [-0.79, -0.52] |  -9.69 | < .001
#> Species [virginica]  |       -0.45 | 0.07 | [-0.59, -0.32] |  -6.68 | < .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

print(model_parameters(m), sep = "")
#> Parameter           Coefficient  SE        95% CIt(147)     p
#> -------------------------------------------------------------
#> (Intercept)                3.430.05[ 3.33,  3.52] 71.36< .001
#> Species [versicolor]      -0.660.07[-0.79, -0.52] -9.69< .001
#> Species [virginica]       -0.450.07[-0.59, -0.32] -6.68< .001
#> 
#> Uncertainty intervals (equal-tailed) and p-values (two-tailed) computed
#>   using a Wald t-distribution approximation.

Created on 2022-09-03 with reprex v2.0.2

I'm not sure which IDE you use, but with VS Code you can easily create shortcuts to show/hide/expand panels. I usually use a 3-column layout (also on my laptop), and either expand the console as needed, or switch the console and editor panels from side-by-side to top-bottom. But anyway, we should ensure that all arguments that we have in export_table() are passed down to that function from report's print() methods.

Here are examples with a very minimized window:

https://user-images.githubusercontent.com/26301769/188278001-9180cadd-971b-4404-9bff-9a38d8589867.mp4

https://user-images.githubusercontent.com/26301769/188278013-22700838-7a2c-4a68-82fc-c657f9f384ac.mp4

strengejacke avatar Sep 03 '22 15:09 strengejacke

OK, let's take the extreme case of correlation(mtcars) |> summary(). Actually, a reprex is less useful here because GitHub shows a horizontal scroll bar instead of wrapping. And the print(sep = "") trick is very cool, but it doesn't seem to work with correlation for some reason.

library(correlation)
correlation(mtcars) |>
  summary() |> 
  print(sep = "")
#> # Correlation Matrix (pearson-method)
#> 
#> Parameter |    carb |    gear |       am |       vs |     qsec |       wt |     drat |       hp |     disp |      cyl
#> ---------------------------------------------------------------------------------------------------------------------
#> mpg       |  -0.55* |    0.48 |   0.60** |   0.66** |     0.42 | -0.87*** |  0.68*** | -0.78*** | -0.85*** | -0.85***
#> cyl       |   0.53* |   -0.49 |   -0.52* | -0.81*** |   -0.59* |  0.78*** | -0.70*** |  0.83*** |  0.90*** |         
#> disp      |    0.39 |  -0.56* |   -0.59* | -0.71*** |    -0.43 |  0.89*** | -0.71*** |  0.79*** |          |         
#> hp        | 0.75*** |   -0.13 |    -0.24 | -0.72*** | -0.71*** |   0.66** |    -0.45 |          |          |         
#> drat      |   -0.09 | 0.70*** |  0.71*** |     0.44 |     0.09 | -0.71*** |          |          |          |         
#> wt        |    0.43 |  -0.58* | -0.69*** |   -0.55* |    -0.17 |          |          |          |          |         
#> qsec      | -0.66** |   -0.21 |    -0.23 |  0.74*** |          |          |          |          |          |         
#> vs        |  -0.57* |    0.21 |     0.17 |          |          |          |          |          |          |         
#> am        |    0.06 | 0.79*** |          |          |          |          |          |          |          |         
#> gear      |    0.27 |         |          |          |          |          |          |          |          |         
#> 
#> p-value adjustment method: Holm (1979)

Created on 2022-09-03 by the reprex package (v2.0.1)

And thanks for the tip about resizing my console to full screen width. It works in RStudio too (which I use), and I might have used that before and then just stopped bothering with it haha. I should probably learn the keyboard shortcuts for that though...

rempsyc avatar Sep 03 '22 15:09 rempsyc

Adding another example of tabular output styles: https://www.statsmodels.org/stable/index.html

[image]

strengejacke avatar Feb 02 '23 15:02 strengejacke

Probably unpopular opinion, but I think the vertical lines (in our printing method) are ugly. I'm fully aware that this is a strong bias because I've fully introjected APA style conventions. But I still think our outputs should actually look more like this 😈

library(bruceR)

model = lm(Temp ~ Month + Day + Wind + Solar.R, data=airquality)
print_table(model)
#> ──────────────────────────────────────────────
#>              Estimate    S.E.      t     p    
#> ──────────────────────────────────────────────
#> (Intercept)    68.770 (4.391) 15.662 <.001 ***
#> Month           2.225 (0.441)  5.047 <.001 ***
#> Day            -0.084 (0.070) -1.194  .234    
#> Wind           -1.003 (0.176) -5.695 <.001 ***
#> Solar.R         0.027 (0.007)  3.991 <.001 ***
#> ──────────────────────────────────────────────

Created on 2023-02-02 with reprex v2.0.2

https://psychbruce.github.io/bruceR/reference/print_table.html

Edit: I'm also OK with tabular output.

rempsyc avatar Feb 02 '23 16:02 rempsyc

Yes, feel free to modify the output style! We have a certain style in, say, parameters, and I don't see why we need to copy this in report as well. It's ok to make it different here. See https://easystats.github.io/insight/reference/export_table.html, you can change the sep argument to remove vertical lines.

strengejacke avatar Feb 02 '23 18:02 strengejacke

> and I don't see why we need to copy this in report as well. It's ok to make it different here.

Agreed. And APA-like style as a default makes sense, I'd say.

DominiqueMakowski avatar Feb 02 '23 20:02 DominiqueMakowski

The main reason for the vertical lines is that they make the output a valid markdown table. We should prioritize making tables that become formatted properly when compiled to html, word, or pdf.

bwiernik avatar Feb 02 '23 23:02 bwiernik

> We should prioritize making tables that become formatted properly when compiled to html, word, or pdf

Good point. But where or for whom should we prioritize such tables? For the pkgdown website? It seems not to be working as expected there: https://easystats.github.io/report/articles/report.html#grouped-dataframes

Or do you mean for users that compile reports to html, word, or pdf for other reasons? What is the added benefit from using html_document: df_print: kable for all those formats which will apply to all data frames?

Regardless, the table pasted as text (not code) does format correctly, but how do you get the same result from within a code chunk when compiling to these formats? It does not seem to be picked up automatically. Even using results = "asis", I'm not getting the same result.

[image: report table]

rempsyc avatar Feb 03 '23 00:02 rempsyc

For markdown, we have the format argument in export_table(). See https://easystats.github.io/insight/articles/display.html

strengejacke avatar Feb 03 '23 06:02 strengejacke
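
For reference, a minimal sketch of the two export_table() arguments mentioned in this thread, sep and format. It assumes the insight package is installed. The data frame coefs is a hypothetical stand-in whose values are copied from the example output above; it is not what report() returns internally.

```r
library(insight)

# Hypothetical coefficient table; values copied from the output shown earlier
coefs <- data.frame(
  Parameter = c("(Intercept)", "Petal.Length"),
  Coefficient = c(3.68, 0.90),
  SE = c(0.11, 0.06)
)

# Plain-text table with two spaces between columns instead of " | "
txt <- export_table(coefs, sep = "  ")
txt

# Pipe-style markdown table, meant for documents compiled to html, word, or pdf
md <- export_table(coefs, format = "markdown")
md
```

The first call gives the narrower, APA-like console output discussed above; the second keeps the vertical bars because they are part of the markdown table syntax. Whether report's print() methods forward these arguments is exactly the open point of this issue.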