papeR
knitr::kable() summary output gives variable names *and* labels
The result of summarize_numeric() contains the variable names as row names if labels = TRUE. These row names are printed in markdown output via the default knitr::kable(), which is unintended. A workaround is to call knitr::kable() with argument row.names = FALSE. In contrast, the print.xtable.summary method automatically hides the row names.
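As a minimal sketch of the workaround, assuming knitr is installed: the data frame below is a hand-built stand-in for a labelled summary (it is not actual papeR output), with variable names as row names and the labels in the first column.

```r
library(knitr)

# Stand-in for a labelled summary: variable names as row names,
# labels in the first column (hand-built, not papeR output)
tab <- data.frame(
  label = c("Fissure distance (mm)", "age"),
  N = c(108, 108),
  row.names = c("distance", "age"),
  stringsAsFactors = FALSE
)

# Default: non-trivial row names are printed as an extra first column
with_rn <- kable(tab, format = "markdown")

# Workaround: suppress the row names explicitly
without_rn <- kable(tab, format = "markdown", row.names = FALSE)

with_rn
without_rn
```

The first table shows both the variable name and the label in each row; the second shows only the label.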
Illustration:
library("papeR")
#> Loading required package: car
#> Loading required package: carData
#> Loading required package: xtable
#>
#> Attaching package: 'papeR'
#> The following object is masked from 'package:utils':
#>
#> toLatex
data(Orthodont, package = "nlme")
labels(Orthodont, "distance") <- "Fissure distance (mm)"
print(sum0 <- summarize(Orthodont))
#> Factors are dropped from the summary
#> N Mean SD Min Q1 Median Q3 Max
#> 1 distance 108 24.02 2.93 16.5 22 23.75 26 31.5
#> 2 age 108 11.00 2.25 8.0 9 11.00 13 14.0
print(sum1 <- summarize(Orthodont, labels = TRUE))
#> Factors are dropped from the summary
#> N Mean SD Min Q1 Median Q3 Max
#> 1 Fissure distance (mm) 108 24.02 2.93 16.5 22 23.75 26 31.5
#> 2 age 108 11.00 2.25 8.0 9 11.00 13 14.0
## xtable() uses *either* variable names *or* labels
xtable(sum0)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb 5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#> \toprule
#> & N & & Mean & SD & & Min & Q1 & Median & Q3 & Max \\
#> \cmidrule{2-2} \cmidrule{4-5} \cmidrule{7-11}
#> distance & 108 & & 24.02 & 2.93 & & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\
#> age & 108 & & 11.00 & 2.25 & & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\
#> \bottomrule
#> \end{tabular}
#> \end{center}
xtable(sum1)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb 5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#> \toprule
#> & N & & Mean & SD & & Min & Q1 & Median & Q3 & Max \\
#> \cmidrule{2-2} \cmidrule{4-5} \cmidrule{7-11}
#> Fissure distance (mm) & 108 & & 24.02 & 2.93 & & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\
#> age & 108 & & 11.00 & 2.25 & & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\
#> \bottomrule
#> \end{tabular}
#> \end{center}
## however, kable() gives both
knitr::kable(sum0)
| | N | | Mean | SD | | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|---|---|
| distance | 108 | | 24.02 | 2.93 | | 16.5 | 22 | 23.75 | 26 | 31.5 |
| age | 108 | | 11.00 | 2.25 | | 8.0 | 9 | 11.00 | 13 | 14.0 |
knitr::kable(sum1) # gives variable names *and* labels
| | | N | | Mean | SD | | Min | Q1 | Median | Q3 | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| distance | Fissure distance (mm) | 108 | | 24.02 | 2.93 | | 16.5 | 22 | 23.75 | 26 | 31.5 |
| age | age | 108 | | 11.00 | 2.25 | | 8.0 | 9 | 11.00 | 13 | 14.0 |
The reason for this is that the data.frame() setup in summarize_numeric() (https://github.com/hofnerb/papeR/blob/9e79d5b1c73e8a2db69449b19104bc65926b776d/R/summarize.R#L129-L133) picks up names(variable.labels) as default row names.
rownames(sum0)
#> [1] "1" "2"
rownames(sum1)
#> [1] "distance" "age"
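The mechanism can be reproduced in base R without papeR. This is a sketch under the assumption that variable.labels is a named character vector; the values below are typed in by hand, and the real code in summarize_numeric() may build its data frame differently.

```r
# Named character vector standing in for variable.labels (assumed shape)
variable.labels <- c(distance = "Fissure distance (mm)", age = "age")

# data.frame() takes its row names from the first component that has names
with_names <- data.frame(label = variable.labels, N = c(108, 108))
rownames(with_names)
#> [1] "distance" "age"

# row.names = NULL forces automatic row names instead
without_names <- data.frame(label = variable.labels, N = c(108, 108),
                            row.names = NULL)
rownames(without_names)
#> [1] "1" "2"
```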
Are these row names used anywhere else?
For markdown output, it would be convenient to have no row names, i.e., force row.names = NULL in the above data.frame setup. Otherwise I need to manually call knitr::kable() with row.names = FALSE for every summary in my R Markdown document.
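Until this is changed in the package, one user-side option is to reset the row names once per object. The helper below is hypothetical (drop_row_names is not part of papeR or knitr) and is shown on a hand-built data frame; I have not checked it against papeR's summary class.

```r
# Hypothetical helper: reset row names to the automatic 1, 2, ... sequence
drop_row_names <- function(x) {
  rownames(x) <- NULL
  x
}

# Hand-built stand-in for a labelled summary
tab <- data.frame(
  label = c("Fissure distance (mm)", "age"),
  N = c(108, 108),
  row.names = c("distance", "age"),
  stringsAsFactors = FALSE
)

clean <- drop_row_names(tab)
rownames(clean)
#> [1] "1" "2"
```

With automatic row names, knitr::kable() omits the row-name column under its default row.names = NA, which is also why the unlabelled summary prints without row names.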