
knitr::kable() summary output gives variable names *and* labels

Open bastistician opened this issue 6 years ago • 0 comments

The result of summarize_numeric() contains the variable names as row names if labels = TRUE. The default knitr::kable() prints these row names in markdown output, which is unintended: every row then shows both the variable name and its label. A workaround is to call knitr::kable() with argument row.names = FALSE. In contrast, the print.xtable.summary method automatically hides the row names.

Illustration:

library("papeR")
#> Loading required package: car
#> Loading required package: carData
#> Loading required package: xtable
#> 
#> Attaching package: 'papeR'
#> The following object is masked from 'package:utils':
#> 
#>     toLatex
data(Orthodont, package = "nlme")
labels(Orthodont, "distance") <- "Fissure distance (mm)"

print(sum0 <- summarize(Orthodont))
#> Factors are dropped from the summary
#>              N    Mean   SD    Min Q1 Median Q3  Max
#> 1 distance 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2      age 108   11.00 2.25    8.0  9  11.00 13 14.0
print(sum1 <- summarize(Orthodont, labels = TRUE))
#> Factors are dropped from the summary
#>                           N    Mean   SD    Min Q1 Median Q3  Max
#> 1 Fissure distance (mm) 108   24.02 2.93   16.5 22  23.75 26 31.5
#> 2                   age 108   11.00 2.25    8.0  9  11.00 13 14.0

## xtable() uses *either* variable names *or* labels
xtable(sum0)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  distance & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}
xtable(sum1)
#> NOTE: Output requires \usepackage{booktabs} in your preamble.
#> \begin{center}
#> % latex table generated in R 3.4.4 by xtable 1.8-3 package
#> % Tue Feb  5 17:20:54 2019
#> \begin{tabular}{lrrrrrrrrrr}
#>   \toprule
#>    & N &   & Mean & SD &   & Min & Q1 & Median & Q3 & Max \\ 
#>     \cmidrule{2-2}  \cmidrule{4-5} \cmidrule{7-11}
#>  Fissure distance (mm) & 108 &  & 24.02 & 2.93 &  & 16.50 & 22.00 & 23.75 & 26.00 & 31.50 \\ 
#>   age & 108 &  & 11.00 & 2.25 &  & 8.00 & 9.00 & 11.00 & 13.00 & 14.00 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{center}

## however, kable() gives both
knitr::kable(sum0)
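|          |   N |  Mean |   SD |  Min | Q1 | Median | Q3 |  Max |
|:---------|----:|------:|-----:|-----:|---:|-------:|---:|-----:|
| distance | 108 | 24.02 | 2.93 | 16.5 | 22 |  23.75 | 26 | 31.5 |
| age      | 108 | 11.00 | 2.25 |  8.0 |  9 |  11.00 | 13 | 14.0 |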
knitr::kable(sum1)  # gives variable names *and* labels
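|          |                       |   N |  Mean |   SD |  Min | Q1 | Median | Q3 |  Max |
|:---------|:----------------------|----:|------:|-----:|-----:|---:|-------:|---:|-----:|
| distance | Fissure distance (mm) | 108 | 24.02 | 2.93 | 16.5 | 22 |  23.75 | 26 | 31.5 |
| age      | age                   | 108 | 11.00 | 2.25 |  8.0 |  9 |  11.00 | 13 | 14.0 |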

The reason for this is that the data.frame() setup in summarize_numeric() https://github.com/hofnerb/papeR/blob/9e79d5b1c73e8a2db69449b19104bc65926b776d/R/summarize.R#L129-L133 picks up names(variable.labels) as default row names.

rownames(sum0)
#> [1] "1" "2"
rownames(sum1)
#> [1] "distance" "age"
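
The same behaviour can be reproduced in base R without papeR. This is only a minimal sketch: the vector variable.labels and the column values below are made-up stand-ins, not the actual objects inside summarize_numeric().

```r
# Hypothetical stand-in for the named label vector (assumption, not papeR code)
variable.labels <- c(distance = "Fissure distance (mm)", age = "age")

# data.frame() picks up the names of a named vector as row names
df_named <- data.frame(var = variable.labels, N = c(108, 108))
rownames(df_named)

# Without names, the automatic row names 1, 2, ... are kept
df_plain <- data.frame(var = unname(variable.labels), N = c(108, 108))
rownames(df_plain)
```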

Are these row names used anywhere else?

For markdown output, it would be convenient to have no row names, i.e., force row.names = NULL in the above data.frame setup. Otherwise I need to manually call knitr::kable() with row.names = FALSE for every summary in my R Markdown document.
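
A sketch of what this could look like, using made-up stand-in data rather than the actual papeR code: either pass row.names = NULL when the data frame is built, or, on the user side, reset the row names of an existing object once. Whether resetting row names leaves a papeR summary object otherwise intact is not checked here.

```r
# Hypothetical stand-in for the named label vector (assumption, not papeR code)
variable.labels <- c(distance = "Fissure distance (mm)", age = "age")

# Option 1: suppress the names when the data frame is constructed
df_fixed <- data.frame(var = variable.labels, N = c(108, 108),
                       row.names = NULL)
rownames(df_fixed)

# Option 2: reset the row names of an already constructed data frame
df_named <- data.frame(var = variable.labels, N = c(108, 108))
rownames(df_named) <- NULL
rownames(df_named)
```

With automatic row names in place, knitr::kable() omits them by default, so no explicit row.names = FALSE is needed.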

bastistician · Feb 05 '19 16:02