using tables with vertical lines for generated latex

Open Sildra opened this issue 12 years ago • 41 comments

Reader : Markdown Writer : LaTeX Current version : Pandoc 1.11.1

Pandoc has no way to produce tables with vertical lines. I've been tweaking the TeX generated from a Markdown grid table such as:

+---------+-------------+---------+
|         |  c1         |  c2     |
+=========+=============+=========+ 
| e1      |  e2         |  10     |
+---------+-------------+---------+
| e4      |  e5         |  15     |
+---------+-------------+---------+

and it appears that the generated LaTeX has a few issues:

  • the numbers are wrapped in \begin{verbatim} and \end{verbatim}, which breaks the current font style (minor, and unrelated to this request)
  • each row ends with \noalign{\medskip}, which breaks the vertical lines of the grid (a major obstacle to implementing this request)

Apart from that, the only element missing to get the grid appears to be the pipe in the column specification: \begin{longtable}[c]{@{}|l|l|l|@{}}

Sildra avatar Jul 18 '13 11:07 Sildra

  1. I don't think the vertical lines should be included by default, and I won't change pandoc to do that. Maybe eventually some syntax should be added to indicate "vertical line here", but there's no representation of that in pandoc's current Table element.
  2. I don't see the verbatims in my output. What version of pandoc are you using?
  3. The \noalign{\medskip} is there to help prevent the table from looking too crammed together.

jgm avatar Jul 18 '13 22:07 jgm

  1. There are 4 representations of tables in pandoc, and grid_tables is the one that fits best with vertical lines (when I first used it, I expected it to produce them). To me, the generated LaTeX matches the simple_tables representation but not grid_tables.
  2. For the verbatims, I create a cell containing only numbers (version 1.11.1).
  3. You can use \\[0.5em] instead of \noalign{\medskip}.
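
Editorial sketch of point 3, not part of the original comment: a few lines of Python could swap the spacing command in an already generated .tex file. It assumes each row ends with `\\` directly followed by `\noalign{\medskip}` (optionally separated by whitespace), which may not match what every pandoc version emits. The name `replace_medskip` and the default gap are hypothetical.

```python
import re

# Assumption: a row ends with "\\" followed by "\noalign{\medskip}",
# possibly with whitespace (including a newline) in between.
MEDSKIP_RE = re.compile(r"\\\\\s*\\noalign\{\\medskip\}")


def replace_medskip(tex: str, gap: str = "0.5em") -> str:
    """Replace '\\\\' + '\\noalign{\\medskip}' with '\\\\[gap]'."""
    # A function is used as the replacement so that the backslashes are
    # inserted literally instead of being parsed as regex escapes.
    return MEDSKIP_RE.sub(lambda _m: "\\\\[" + gap + "]", tex)
```

Unlike `\noalign`, the optional argument of `\\` adds the extra space inside the row itself, so vertical rules in the column specification are not interrupted.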

Sildra avatar Jul 29 '13 15:07 Sildra

What is the correct way to add vertical lines when it is required?

jacobus avatar Mar 25 '14 12:03 jacobus

There is currently no way to add vertical lines, other than postprocessing pandoc's latex output. Use perl or a similar tool to change

\begin{longtable}[c]{@{}lr@{}}

for example to

\begin{longtable}[c]{@{}l|r@{}}
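
Editorial sketch, not from the original comment: the same substitution in Python instead of perl. It assumes the column specification consists only of simple `l`, `c` and `r` columns wrapped in `@{}`, as in the example above; specifications containing `p{...}` or `>{...}` columns are left untouched. The function name `add_vertical_rules` and the `outer` flag are hypothetical.

```python
import re

# Matches e.g. \begin{longtable}[c]{@{}lr@{}} and captures the parts
# before, inside and after the run of simple column letters.
LONGTABLE_RE = re.compile(
    r"(\\begin\{longtable\}(?:\[[^\]]*\])?\{@\{\})([lcr]+)(@\{\}\})"
)


def add_vertical_rules(tex: str, outer: bool = False) -> str:
    """Insert '|' between the columns of every matching longtable."""

    def repl(match: re.Match) -> str:
        cols = "|".join(match.group(2))  # "lr" -> "l|r"
        if outer:
            cols = "|" + cols + "|"      # also draw the outer borders
        return match.group(1) + cols + match.group(3)

    return LONGTABLE_RE.sub(repl, tex)
```

Applied to the text of a file produced with `pandoc -t latex`, this would turn the first line above into the second; with `outer=True` it would also add the left and right borders.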

jgm avatar Mar 25 '14 16:03 jgm

@jgm John, what are your thoughts on using custom LaTeX commands (or environments), defined in the template, that map semantically to structures in pandoc-types? That way a lot of the LaTeX formatting logic can be segregated to the template, instead of being locked away inside the binary. The tradeoff obviously would be giving up on "standard" LaTeX output, so it wouldn't be an easy decision. I'm just wondering if you've considered it at all.

timtylin avatar Mar 25 '14 19:03 timtylin

+++ Tim Lin [Mar 25 14 12:16 ]:

@jgm John, what are your thoughts on using custom LaTeX commands (or environments), defined in the template, that map semantically to structures in pandoc-types? That way a lot of the LaTeX formatting logic can be segregated to the template, instead of being locked away inside the binary. The tradeoff obviously would be giving up on "standard" LaTeX output, so it wouldn't be an easy decision. I'm just wondering if you've considered it at all.

I have considered it (and it has been discussed on pandoc-discuss). My current view is still that there's value in having pandoc emit standard LaTeX, even though I can also see the advantages in the "custom" (and more customizable) approach.

But it's an issue worth reconsidering from time to time.

jgm avatar Mar 25 '14 19:03 jgm

Time has passed and the issue keeps coming back. What are your thoughts about it now?

@jgm You wrote:

Maybe eventually some syntax should be added to indicate "vertical line here", but there's no representation of that in pandoc's current Table element.

Has that changed?

kubabuczak avatar Apr 22 '16 15:04 kubabuczak

No change.

jgm avatar Apr 22 '16 16:04 jgm

This is a seriously strange perspective. I use pandoc to make my lecture notes. The times when you want a table without vertical lines are extremely rare. Given the fact that we have so many representations of tables, it's perfectly natural to expect some of them (at least) to be used to allow vertical separators. Instead, I'm going to have to make a custom command in the filter to specify table layout and hope it works.

bluddy avatar Jan 26 '17 20:01 bluddy

can't comment on whether it should be available or not, but here's a hack if you don't want to do it manually every time

https://gist.github.com/svenevs/41a1a434a055adcee56bd1f0374fa254

svenevs avatar Mar 25 '17 07:03 svenevs

@bluddy

This is a seriously strange perspective. I use pandoc to make my lecture notes. The times when you want a table without vertical lines are extremely rare.

Read the manual of the booktabs package in LaTeX first. The author argued why this is wrong, and why it is extremely rare to need vertical rules.

If one is not using LaTeX, I can easily understand why people do not apply good typography. But as long as LaTeX is involved, I think it is very important to stand up for good typography, which is what LaTeX is all about.

On the other hand, I can understand that writing lecture notes is a much more time-constrained task than publishing, so there might not be much time to design tables to look more professional (a table that wasn't designed to be publishable might not work well without vertical rules). So my point is not that vertical rules should never be applied, but that we should understand that having vertical rules in LaTeX is actually the "seriously strange perspective", not the other way around. (Although there can be a need to do seriously strange things because of other constraints like available time, the source of the table, etc.)

From my experience with pandoc's output, I agree that the table output is often what one has to worry about. This stems from the sheer variety of tables, each designed for a different kind of information. There is a huge number of LaTeX packages that handle tables, exactly because no one package excels at every kind of table. The complexity comes from the need for publication quality and the constraint of a page, neither of which applies to other output formats (except ConTeXt), especially those based on HTML. (HTML tables are usually ugly. My personal CSS actually emulates the booktabs style of tables without vertical rules.)

Lastly, just to point out a common phenomenon: tools like pandoc that emphasize automatic conversion relieve the author of worrying about anything but the content. While this is true, it might lead one to think there is no need to design the content with the output in mind. That is not true, and in practice it can cause a lot of problems (especially for tables; other elements like images have similar problems, but the complexity of "publishable" tables makes this a bigger problem for them).

ickc avatar Mar 25 '17 23:03 ickc

@ickc The author is using a pathological example to make his point. It doesn't represent most use cases.

hasufell avatar Mar 26 '17 00:03 hasufell

@hasufell,

Perhaps I wasn't clear; I have edited the previous message to add that it is addressed to @bluddy. I was talking specifically about his use case.

I also think what I said is general enough for any use case. If you read carefully you would see, for example, "if one is not using LaTeX...", and "my point is not that vertical rules should never be applied", etc.

Lastly, my whole argument is based on the quote. I was saying the statement "This is a seriously strange perspective. I use pandoc to make my lecture notes. The times when you want a table without vertical lines are extremely rare." is wrong.

ickc avatar Mar 26 '17 00:03 ickc

@hasufell, oh, do you mean the author of booktabs? Sorry I misunderstood your statement.

Yes, that's why other packages exist. And while booktabs is not the definitive package for tables, it is the best one and many follow its recommendations.

And I would say that when people have a use case for vertical rules, either they have no respect for typography, or they have no control over the complexity of the table (say, it was given by a colleague, or is required by the audience).

Edit: I have sympathy for the latter group, and think they should be supported. But I wish the first group didn't exist. Technology should be used to augment our capabilities (e.g. arts, design, etc.), not to trash them. [Some people think that the convenience of technology outweighs the ugliness of the design that comes from it. And by the way, that's why Apple has had such huge success (not to say it doesn't have its own problems): it is rare in the industry as a place where "arts meets technology".]

Edit2: The following would be useful on the topic of good table typography in LaTeX: http://tex.stackexchange.com/questions/40542/why-not-use-vertical-lines-in-a-tabular . In short, other kinds of good typography exist, but given the limitations of LaTeX, booktabs seems to be the "only" good solution (at least, anything that keeps mixing vertical and horizontal rules is not).

ickc avatar Mar 26 '17 00:03 ickc

Well, I think that vertical borders in table out-of-the-box would be very nice feature. A few flavors of Markdown, e.g. GitMarkdown, already implement it, so it is actually an in-demand feature. Every time I need to create a nice table in a PDF, I have to plunge into LaTeX code. I literally have to paste chunks of LaTeX table code, which decreases the overall readability of the raw Markdown document.

banderlog avatar Jan 24 '18 09:01 banderlog

Well, I think that vertical borders in table out-of-the-box would be very nice feature.

It would be nice to have, but I would vote for it being optional rather than the default.

ickc avatar Jan 26 '18 02:01 ickc

Hi guys, I just stumbled across this and since I am having a similar problem, I just wanted to give my brief thoughts on this topic. Anyhow, what makes this so hard to implement is the fact that there is practically no separation between style and content in LaTeX tables. For this reason it would be practical if it were possible to define the style without changing the content. After some research I stumbled across the pgfplotstable package, which allows exactly this. For example, the following code compiles to a table with grid lines around it:

\pgfplotstabletypeset[
  col sep=&,
  row sep=\\,
  every head row/.style={
    before row={
      \hline
    },
    after row=\hline\hline,
  },
  column type/.add={|}{},
  every last column/.style={column type/.add={}{|}},
  every last row/.style={
    after row=\hline},
  string type
] {
hello&world\\
This is&a test\\
}

By removing the parts with \hline in them one can remove the horizontal lines and by removing the lines which style the columns one may remove the vertical lines. It is also possible to add support for longtables through:

begin table=\begin{longtable},
end table=\end{longtable},

The largest downside I see to this approach is the fact that we would then depend on another package, however, I believe that this might be worth it. As I said, these are just some ideas that I had when looking into this briefly and they are still very rudimentary. What do you think?

cheers, project-repo

project-repo avatar Jun 14 '19 11:06 project-repo

@project-repo we try not to depend on unusual packages, and to generate some fairly standard LaTeX, so that's a strike against this proposal.

jgm avatar Jun 14 '19 15:06 jgm

According to https://ctan.org/pkg/pgfplotstable, pgfplotstable is included in pgfplots, which is a fairly standard package.

But I can't find an easy way to find a list of included packages in common LaTeX distributions. (e.g. what's included in TeXLive directly from TeXLive, and also the texlive and texlive-full from deb, etc.)

Should we somehow compile a list of packages that are common enough to be used in pandoc, for future reference? (Or is such a list already known?)

ickc avatar Jun 14 '19 22:06 ickc

@ickc You seem to be right that there is no way to find out which packages are commonly installed by default. On Arch Linux the pgfplots package is bundled in texlive-pictures, whereas most of the packages required by pandoc are in texlive-core. However, I noticed that pandoc already has the option to use microtype and upquote if they are available. Maybe we could do something similar. I totally agree with you that we should be more specific about which packages we want to include, so that we avoid becoming too dependency-heavy. However, I believe it will be difficult to come up with a complete list (although texlive-core on Arch Linux is a good starting point).

project-repo avatar Jun 15 '19 06:06 project-repo

The manual has a list of the packages we currently depend on.

jgm avatar Jun 15 '19 17:06 jgm

That's long been there, but what I mean is: is there a need to whitelist a core set of LaTeX packages that are common enough to be considered for use in pandoc, as a reference for any future discussion about whether something can be used or not?

ickc avatar Jun 15 '19 22:06 ickc

The following would be useful on the topic of good table typography in LaTeX: http://tex.stackexchange.com/questions/40542/why-not-use-vertical-lines-in-a-tabular

I don't want to disturb the gods of the holy typesetting, but I for one don't understand how "no row interruptions" could make something like a truth table (let alone if special) more readable. You aren't just supposed to read each row one by one... matching every cell with both of its "generators" seems what your aim is. And atm I get there's no way to get this from plain markdown.

mirh avatar Nov 27 '19 23:11 mirh

That was an old view of mine. My new view is that there are specific needs for different specific situations, and unfortunately it is very difficult to satisfy everyone. In LaTeX, users are granted infinite degrees of freedom to accomplish whatever they want. Pandoc, for various reasons, has to choose a simple solution that satisfies most people's needs. I think from the end users' perspective it may be good to provide at least more than one LaTeX table writer, user-configurable (CLI and/or YAML and/or attributes). But that increases the developers' burden to maintain them. And if they choose to do so, it is also their burden to pick the best combinations, etc. Also, given the constraints on which LaTeX packages they want to depend on, it is quite difficult to strike a balance.

[I still think the booktabs style is a good default for good typography, but perhaps not for every table. The booktabs author would argue that if you have a table that can't be represented nicely using booktabs, you may be presenting your information wrong (I am just paraphrasing from memory what he wrote). But sometimes (or most of the time!) people aren't publishing books and just want to get their document ready to share. And sometimes they don't even have control over the table, which might be very large. IIRC there's a LaTeX package developed exactly because someone, perhaps a sysadmin, needed to be able to produce PDFs with big tables handed to them no matter what.]

My current view on the practical side of dealing with tables is: for complicated tables (or anything else, such as graphs, for that matter) that pandoc can't help you with, just write filters whenever possible and practical. I have a pandoc filter called pantable because I often find it limiting to use pandoc alone to process tables. In fact, along these lines, it would take little effort to modify pantable to leverage some Python library to convert a table to LaTeX. But I just checked https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_latex.html and it generates a booktabs table, so I would probably need to look elsewhere.

ickc avatar Nov 28 '19 03:11 ickc

I mean.. pandoc already supports 4 different table input formats, that's why I thought it wasn't a big deal to tweak/add them (even because after all, isn't this just about adding a simple | in the final latex code?) But I guess like you can't just make up styles mindlessly

p.s. a truth table doesn't sound that intricate tbf

mirh avatar Nov 28 '19 12:11 mirh

So like many of us, I love pandoc and find that it helps me to get my work done. Sometimes a feature that seems like it should be trivial really isn't. If there were better separation of concerns in LaTeX when it comes to tables, we wouldn't be complaining about what pandoc does by default.

For a consulting project I'm doing, I have some seriously long tables that require both horizontal lines (\hline) and vertical lines (by using "|" to separate the field descriptors) to make the tables comprehensible. I wrote a little Python script that can inject these by postprocessing the TeX generated by pandoc. See https://gist.github.com/gkthiruvathukal/05e82bfbec79df32fc239b16ca4ef7d0. Please note that my tables requiring a vertical line between cells are either 2 or 3 columns wide. Adding a horizontal line works with any generated table.
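
Editorial sketch, not the script from the gist: one way the horizontal-line half of such postprocessing might look. It assumes every table row sits on its own line and ends with `\\` or `\tabularnewline`, and that tables use the `longtable` environment. The names `add_horizontal_rules` and `ROW_ENDINGS` are hypothetical.

```python
# Assumption: these are the only ways a row can end in the generated file.
ROW_ENDINGS = ("\\\\", r"\tabularnewline")


def add_horizontal_rules(tex: str) -> str:
    """Append a \\hline line after every row inside longtable environments."""
    out = []
    in_table = False
    for line in tex.splitlines():
        stripped = line.strip()
        if stripped.startswith(r"\begin{longtable}"):
            in_table = True
        elif stripped.startswith(r"\end{longtable}"):
            in_table = False
        out.append(line)
        # Only rows inside a longtable receive a rule.
        if in_table and stripped.endswith(ROW_ENDINGS):
            out.append(r"\hline")
    return "\n".join(out)
```

If the generated table already contains booktabs rules such as `\midrule`, the added `\hline` lines would sit next to them, so a real script would probably want to skip or replace those.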

At first blush, this sure seems like something that could be handled by some simple options on the TeX writer. But let's not forget that @jgm and others are volunteers who gave us these great tools to "complain" about in the first place, and there are many writers to support. (My original output format was Word, and it proved to be almost unusable for tables, since the relative widths are not in any way honored.) Based on the amount of code I had to write, I am sure glad I didn't have to do most of the heavy lifting that Pandoc does for me--out of the box--with minimal configuration and customization. With a little Python love, I can handle my needs in a relatively graceful way.

Of course, if we can get the Pandoc TeX writer to handle the nominal output of a simple vertical and horizontal line--both well-supported by LaTeX syntax--it would be pretty awesome. I'd be willing to try myself, if someone can explain or point me to docs on how to add options to the TeX/PDF writer.

gkthiruvathukal avatar Dec 25 '19 05:12 gkthiruvathukal

Unfortunately it's not just a matter of modifying the tex writer. The underlying data structure representing documents would need to be changed too (pandoc-types). And changes there would require changes in all writers and readers.

jgm avatar Dec 26 '19 21:12 jgm

I agree, @jgm, it would require significant effort--effort that might not be worth the trouble. And based on my explorations, postprocessing to address the common use-cases is fairly straightforward as a workaround. Thanks for the follow-up.

gkthiruvathukal avatar Dec 29 '19 03:12 gkthiruvathukal

Do the changes in March 2020 to pandoc-types (https://github.com/jgm/pandoc-types/pull/66) make a difference to this issue?

the-solipsist avatar Dec 03 '20 00:12 the-solipsist

Do the changes in March 2020 to pandoc-types (jgm/pandoc-types#66) make a difference to this issue?

I don't think so. Others can correct me if I'm wrong. pandoc-types 1.21/22 just has a more general AST. But we're talking about how the writer interprets that AST. E.g. consider a table that is valid in pandoc-types 1.20, i.e. one not using any of the new features in 1.21/22: it should still map to the same LaTeX text.

But frankly I don't understand what changes @jgm was alluding to. Maybe he means there's nothing in the AST indicating whether there are vertical lines or not. (If my interpretation is right, the current AST still doesn't have a way to encode that information, unless you want to hack the attributes to do that.)
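
Editorial sketch of what "hacking the attributes" might look like, not an existing pandoc feature: a helper that reads one Table block from pandoc's JSON AST and derives a LaTeX column specification, adding `|` separators only when the table carries a marker class. It assumes the pandoc-types 1.21+ JSON layout, where a table is `{"t": "Table", "c": [attr, caption, colspecs, head, bodies, foot]}`. The class name `vlines` and the function `column_spec` are hypothetical, and pandoc itself gives this class no meaning.

```python
# Map pandoc alignment constructors to LaTeX column letters.
ALIGN_TO_LATEX = {
    "AlignLeft": "l",
    "AlignRight": "r",
    "AlignCenter": "c",
    "AlignDefault": "l",
}


def column_spec(table: dict, marker_class: str = "vlines") -> str:
    """Build a column spec such as '@{}l|r@{}' from a Table block."""
    # Assumed layout: attr = [identifier, classes, key-value pairs],
    # colspecs = list of [alignment, width] pairs.
    attr, _caption, colspecs = table["c"][:3]
    _identifier, classes, _keyvals = attr
    letters = [ALIGN_TO_LATEX[align["t"]] for align, _width in colspecs]
    separator = "|" if marker_class in classes else ""
    return "@{}" + separator.join(letters) + "@{}"
```

A filter built around this would still have to render the table itself or patch the writer's output, since the stock LaTeX writer does not consult such a class.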

My current view on the topic is that table is the "last battle ground" of pandoc. It is the one thing that a one-size-fits-all approach wouldn't work. (e.g. the very reason there's so many LaTeX packages dealing with tables.) So a more configurable approach to writing tables would be better.

By the way, I'm working on a Python library specialized in writing pandoc tables. It's not on my roadmap yet, but it has been on my mind to add different LaTeX writers for tables. E.g. pandoc has constraints on which LaTeX packages to depend on, and it currently chooses only one. So more exotic LaTeX packages would be a better fit for a filter.

ickc avatar Dec 03 '20 00:12 ickc