Conversion to LaTeX of output of `colSums()` could be improved

Open krinsman opened this issue 7 years ago • 0 comments

Note: This issue was originally filed with Jupyter nbconvert, but it turned out to be relevant to this project instead, so it is being refiled here.

Example: Given an R data frame `data` with columns `ever_self_employed` (0 missing entries), `log_tot` (712 missing entries), and `treated` (0 missing entries), the output of running

    colSums(is.na(data))

in Jupyter Notebook is converted to

    \begin{description*}
    \item[ever\textbackslash{}\_self\textbackslash{}\_employed] 0
    \item[log\textbackslash{}\_tot] 712
    \item[treated] 0
    \end{description*}

This doesn't look very much like the original output from the notebook.

  1. While `\_` is necessary for variable names with underscores to render correctly in LaTeX, the inserted `\textbackslash{}` is not required, and in fact changes the output from what it looked like in the original Jupyter notebook.
  2. The alignment is poor: the output should be a two-column table, with the first column right-aligned and the second column left-aligned.
  3. There are no line breaks between the rows, even though there should be.
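To make point 1 concrete, here is a minimal sketch that typesets both escapings next to each other. The document class and the plain `description` environment are assumptions made only to keep the snippet self-contained (the generated output uses `description*`); the two `\item` labels are the relevant part.

```latex
\documentclass{article}
\begin{document}
\begin{description}
% As currently generated: \textbackslash{} prints a literal backslash,
% so the label is typeset as  ever\_self\_employed
\item[ever\textbackslash{}\_self\textbackslash{}\_employed] 0
% With only the underscore escaped, the label is typeset as
% ever_self_employed, matching the notebook
\item[ever\_self\_employed] 0
\end{description}
\end{document}
```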

Here is what I propose as a fix for this specific example:

  1. Change the environment from `description*` to `longtable` (`longtable` rather than `tabular` so that the list breaks across pages when it is very long, as was necessary in another example I had which was too long for an MWE).
  2. Explicitly wrap the text in the left column with `\textbf{}`.
  3. In the formatting part of the notebook, after the line `\usepackage{longtable}` (which is already in the preamble anyway for pandoc support), add a line `\setlength{\LTleft}{-1cm plus -1fill}` so that the resulting `longtable` is approximately left-adjusted (with the default setting the `longtable` looks centered, unlike the output in the original notebook).
  4. Use `&` and `\\` to delimit the cells and rows of the `longtable` explicitly.

Here is the code I have for this specific example which seems to more faithfully reproduce the output from the original notebook:

    \usepackage{longtable} % longtable support required by pandoc >1.10
    % according to answer here: https://tex.stackexchange.com/questions/32726/center-wide-longtable-not-tabular-or-tabularx/32729
    % can be used to avoid the table being too far in the center
    \setlength{\LTleft}{-1cm plus -1fill}

...

    \begin{longtable}{rl}
    \textbf{ever\_self\_employed} & 0 \\
    \textbf{log\_tot} & 712 \\
    \textbf{treated} & 0
    \end{longtable}

Admittedly, `\setlength{\LTleft}{-1cm plus -1fill}` seems to push the table a little too far to the left in some cases.
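A possible alternative to the negative offset, given here only as a sketch: `longtable` places the table using the glue lengths `\LTleft` and `\LTright`, both of which default to `\fill` (hence the centering). Setting `\LTleft` to `0pt` and leaving `\LTright` at `\fill` should put the table flush with the left margin. The standalone document class below is an assumption for illustration, and how this interacts with the rest of the notebook's LaTeX template has not been checked here.

```latex
\documentclass{article}
\usepackage{longtable}
% No glue on the left, stretchable glue on the right:
% the table sits flush against the left margin.
\setlength{\LTleft}{0pt}
\setlength{\LTright}{\fill}
\begin{document}
\begin{longtable}{rl}
\textbf{ever\_self\_employed} & 0 \\
\textbf{log\_tot} & 712 \\
\textbf{treated} & 0 \\
\end{longtable}
\end{document}
```

If a small indent is wanted instead of a strictly flush-left table, `\LTleft` can be set to a positive length such as `\parindent`.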

krinsman • May 07 '18 16:05